Message ID | 20230629084042.86502-5-zhenzhong.duan@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | VFIO migration related refactor and bug fix | expand |
On 29/06/2023 09:40, Zhenzhong Duan wrote: > When vfio_realize() succeeds, hot unplug will call vfio_exitfn() > to free resources allocated in vfio_realize(); when vfio_realize() > fails, vfio_exitfn() is never called and we need to free resources > in vfio_realize(). > > In the case that vfio_migration_realize() fails, > e.g: with -only-migratable & enable-migration=off, we see below: > > (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off > 0000:81:11.1: Migration disabled > Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device > > If we hotplug again we should see same log as above, but we see: > (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off > Error: vfio 0000:81:11.1: device is already attached > > That's because some references to VFIO device isn't released, > we should check return value of vfio_migration_realize() and > release the references, then VFIO device will be truely > released when hotplug fails. > > Fixes: a22651053b59 ("vfio: Make vfio-pci device migration capable") > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > hw/vfio/pci.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index 54a8179d1c64..dc69d3031b24 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -3210,6 +3210,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) > ret = vfio_migration_realize(vbasedev, errp); > if (ret) { > error_report("%s: Migration disabled", vbasedev->name); > + goto out_vfio_migration; > } > } > > @@ -3219,6 +3220,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) > > return; > > +out_vfio_migration: > + vfio_migration_exit(vbasedev); > out_deregister: > vfio_disable_interrupts(vdev); > out_intx_disable: I agree with the general sentiment behind the change. Clearly vfio::migration and vfio::migration_blocker are leaking from inside the migration_realize() function. But it is kinda awkward semantic that vfio_migration_realize() (or any realize) failures need to be accompanied with a vfio_migration_exit() that tears down state *leaked* by its realize() failure. It sounds to me that this should be inside the vfio_migration_realize() not on the caller? Unless QEMU ::realize() is expected to do this.
On 6/29/23 13:45, Joao Martins wrote: > On 29/06/2023 09:40, Zhenzhong Duan wrote: >> When vfio_realize() succeeds, hot unplug will call vfio_exitfn() >> to free resources allocated in vfio_realize(); when vfio_realize() >> fails, vfio_exitfn() is never called and we need to free resources >> in vfio_realize(). >> >> In the case that vfio_migration_realize() fails, >> e.g: with -only-migratable & enable-migration=off, we see below: >> >> (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off >> 0000:81:11.1: Migration disabled >> Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device >> >> If we hotplug again we should see same log as above, but we see: >> (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off >> Error: vfio 0000:81:11.1: device is already attached >> >> That's because some references to VFIO device isn't released, >> we should check return value of vfio_migration_realize() and >> release the references, then VFIO device will be truely >> released when hotplug fails. >> >> Fixes: a22651053b59 ("vfio: Make vfio-pci device migration capable") >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> --- >> hw/vfio/pci.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >> index 54a8179d1c64..dc69d3031b24 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -3210,6 +3210,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) >> ret = vfio_migration_realize(vbasedev, errp); >> if (ret) { >> error_report("%s: Migration disabled", vbasedev->name); >> + goto out_vfio_migration; >> } >> } >> >> @@ -3219,6 +3220,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) >> >> return; >> >> +out_vfio_migration: >> + vfio_migration_exit(vbasedev); >> out_deregister: >> vfio_disable_interrupts(vdev); >> out_intx_disable: > > I agree with the general sentiment behind the change. > Clearly vfio::migration and vfio::migration_blocker are leaking from inside the > migration_realize() function. > > But it is kinda awkward semantic that vfio_migration_realize() (or any realize) > failures need to be accompanied with a vfio_migration_exit() that tears down > state *leaked* by its realize() failure. > > It sounds to me that this should be inside the vfio_migration_realize() not on > the caller? Unless QEMU ::realize() is expected to do this. > I agree. vfio_migration_realize() should handle the cleanup of the resources it allocated if there is a failure. Thanks C.
>-----Original Message----- >From: Joao Martins <joao.m.martins@oracle.com> >Subject: Re: [PATCH v4 4/5] vfio/pci: Free resources when >vfio_migration_realize fails > >On 29/06/2023 09:40, Zhenzhong Duan wrote: >> When vfio_realize() succeeds, hot unplug will call vfio_exitfn() to >> free resources allocated in vfio_realize(); when vfio_realize() fails, >> vfio_exitfn() is never called and we need to free resources in >> vfio_realize(). >> >> In the case that vfio_migration_realize() fails, >> e.g: with -only-migratable & enable-migration=off, we see below: >> >> (qemu) device_add >> vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off >> 0000:81:11.1: Migration disabled >> Error: disallowing migration blocker (--only-migratable) for: >> 0000:81:11.1: Migration is disabled for VFIO device >> >> If we hotplug again we should see same log as above, but we see: >> (qemu) device_add >> vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off >> Error: vfio 0000:81:11.1: device is already attached >> >> That's because some references to VFIO device isn't released, we >> should check return value of vfio_migration_realize() and release the >> references, then VFIO device will be truely released when hotplug >> fails. >> >> Fixes: a22651053b59 ("vfio: Make vfio-pci device migration capable") >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> --- >> hw/vfio/pci.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index >> 54a8179d1c64..dc69d3031b24 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -3210,6 +3210,7 @@ static void vfio_realize(PCIDevice *pdev, Error >**errp) >> ret = vfio_migration_realize(vbasedev, errp); >> if (ret) { >> error_report("%s: Migration disabled", vbasedev->name); >> + goto out_vfio_migration; >> } >> } >> >> @@ -3219,6 +3220,8 @@ static void vfio_realize(PCIDevice *pdev, Error >> **errp) >> >> return; >> >> +out_vfio_migration: >> + vfio_migration_exit(vbasedev); >> out_deregister: >> vfio_disable_interrupts(vdev); >> out_intx_disable: > >I agree with the general sentiment behind the change. >Clearly vfio::migration and vfio::migration_blocker are leaking from inside the >migration_realize() function. > >But it is kinda awkward semantic that vfio_migration_realize() (or any realize) >failures need to be accompanied with a vfio_migration_exit() that tears down >state *leaked* by its realize() failure. > >It sounds to me that this should be inside the vfio_migration_realize() not on >the caller? Unless QEMU ::realize() is expected to do this. Good suggestion, will fix. Thanks Zhenzhong
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 54a8179d1c64..dc69d3031b24 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3210,6 +3210,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) ret = vfio_migration_realize(vbasedev, errp); if (ret) { error_report("%s: Migration disabled", vbasedev->name); + goto out_vfio_migration; } } @@ -3219,6 +3220,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) return; +out_vfio_migration: + vfio_migration_exit(vbasedev); out_deregister: vfio_disable_interrupts(vdev); out_intx_disable:
When vfio_realize() succeeds, hot unplug will call vfio_exitfn() to free resources allocated in vfio_realize(); when vfio_realize() fails, vfio_exitfn() is never called and we need to free resources in vfio_realize(). In the case that vfio_migration_realize() fails, e.g: with -only-migratable & enable-migration=off, we see below: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off 0000:81:11.1: Migration disabled Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device If we hotplug again we should see same log as above, but we see: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off Error: vfio 0000:81:11.1: device is already attached That's because some references to VFIO device isn't released, we should check return value of vfio_migration_realize() and release the references, then VFIO device will be truely released when hotplug fails. Fixes: a22651053b59 ("vfio: Make vfio-pci device migration capable") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- hw/vfio/pci.c | 3 +++ 1 file changed, 3 insertions(+)