mbox series

[0/1] Reenable runtime PM for dynamically unbound devices

Message ID 20240319120410.1477713-1-mike.malyshev@gmail.com (mailing list archive)
Headers show
Series Reenable runtime PM for dynamically unbound devices | expand

Message

Mikhail Malyshev March 19, 2024, 12:04 p.m. UTC
When trying to run a VM with PCI passthrough of intel-eth-pci ETH device
QEMU fails with "Permission denied" error. This happens only if
intel-eth-pci driver is dynamically unbound from the device using
"echo -n $DEV > /sys/bus/pci/drivers/stmmac/unbind" command. If
"vfio-pci.ids=..." is used to bind the device to vfio-pci driver and the
device is never probed by intel-eth-pci driver the problem does not occur.

When intel-eth-pci driver is dynamically unbound from the device
.remove()
  intel_eth_pci_remove()
    stmmac_dvr_remove()
      pm_runtime_disable();

Later when QEMU tries to get the device file descriptor by calling
VFIO_GROUP_GET_DEVICE_FD ioctl pm_runtime_resume_and_get returns -EACCES.
It happens because dev->power.disable_depth == 1 .

vfio_group_fops_unl_ioctl(VFIO_GROUP_GET_DEVICE_FD)
  vfio_group_ioctl_get_device_fd()
    vfio_device_open()
      ret = device->ops->open_device()
        vfio_pci_open_device()
          vfio_pci_core_enable()
              ret = pm_runtime_resume_and_get();

This behavior was introduced by
commit 7ab5e10eda02 ("vfio/pci: Move the unused device into low power state with runtime PM")

This may be the case for any driver calling pm_runtime_disable() in its
.remove() callback.

The case when a runtime PM may be disable for a device is not handled so we
call pm_runtime_enable() in vfio_pci_core_register_device to re-enable it.

Mikhail Malyshev (1):
  vfio/pci: Reenable runtime PM for dynamically unbound devices

 drivers/vfio/pci/vfio_pci_core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

--
2.34.1

Comments

Alex Williamson March 19, 2024, 2:50 p.m. UTC | #1
On Tue, 19 Mar 2024 12:04:09 +0000
Mikhail Malyshev <mike.malyshev@gmail.com> wrote:

> When trying to run a VM with PCI passthrough of intel-eth-pci ETH device
> QEMU fails with "Permission denied" error. This happens only if
> intel-eth-pci driver is dynamically unbound from the device using
> "echo -n $DEV > /sys/bus/pci/drivers/stmmac/unbind" command. If
> "vfio-pci.ids=..." is used to bind the device to vfio-pci driver and the
> device is never probed by intel-eth-pci driver the problem does not occur.
> 
> When intel-eth-pci driver is dynamically unbound from the device
> .remove()
>   intel_eth_pci_remove()
>     stmmac_dvr_remove()
>       pm_runtime_disable();

Why isn't the issue in intel-eth-pci?

For example stmmac_dvr_remove() does indeed call pm_runtime_disable()
unconditionally, but stmmac_dvr_probe() only conditionally calls
pm_runtime_enable() with logic like proposed here for vfio-pci.  Isn't
it this conditional enabling which causes an unbalanced disable depth
that's the core of the problem?

It doesn't seem like it should be the responsibility of the next driver
to correct the state from the previous driver.  You've indicated that
the device works with vfio-pci if there's no previous driver, so
clearly intel-eth-pci isn't leaving the device in the same runtime pm
state that it found it.  Thanks,

Alex

> Later when QEMU tries to get the device file descriptor by calling
> VFIO_GROUP_GET_DEVICE_FD ioctl pm_runtime_resume_and_get returns -EACCES.
> It happens because dev->power.disable_depth == 1 .
> 
> vfio_group_fops_unl_ioctl(VFIO_GROUP_GET_DEVICE_FD)
>   vfio_group_ioctl_get_device_fd()
>     vfio_device_open()
>       ret = device->ops->open_device()
>         vfio_pci_open_device()
>           vfio_pci_core_enable()
>               ret = pm_runtime_resume_and_get();
> 
> This behavior was introduced by
> commit 7ab5e10eda02 ("vfio/pci: Move the unused device into low power state with runtime PM")
> 
> This may be the case for any driver calling pm_runtime_disable() in its
> .remove() callback.
> 
> The case when a runtime PM may be disable for a device is not handled so we
> call pm_runtime_enable() in vfio_pci_core_register_device to re-enable it.
> 
> Mikhail Malyshev (1):
>   vfio/pci: Reenable runtime PM for dynamically unbound devices
> 
>  drivers/vfio/pci/vfio_pci_core.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> --
> 2.34.1
>
Mikhail Malyshev March 21, 2024, 12:23 p.m. UTC | #2
On Tue, Mar 19, 2024 at 08:50:11AM -0600, Alex Williamson wrote:
> On Tue, 19 Mar 2024 12:04:09 +0000
> Mikhail Malyshev <mike.malyshev@gmail.com> wrote:
> 
> > When trying to run a VM with PCI passthrough of intel-eth-pci ETH device
> > QEMU fails with "Permission denied" error. This happens only if
> > intel-eth-pci driver is dynamically unbound from the device using
> > "echo -n $DEV > /sys/bus/pci/drivers/stmmac/unbind" command. If
> > "vfio-pci.ids=..." is used to bind the device to vfio-pci driver and the
> > device is never probed by intel-eth-pci driver the problem does not occur.
> > 
> > When intel-eth-pci driver is dynamically unbound from the device
> > .remove()
> >   intel_eth_pci_remove()
> >     stmmac_dvr_remove()
> >       pm_runtime_disable();
> 
> Why isn't the issue in intel-eth-pci?
> 
> For example stmmac_dvr_remove() does indeed call pm_runtime_disable()
> unconditionally, but stmmac_dvr_probe() only conditionally calls
> pm_runtime_enable() with logic like proposed here for vfio-pci.  Isn't
> it this conditional enabling which causes an unbalanced disable depth
> that's the core of the problem?
> 
The common code in the stmmac driver is used for both PCI and non-PCI
drivers and this code doen't handle this correctly. That condition is
actually wrong

> It doesn't seem like it should be the responsibility of the next driver
> to correct the state from the previous driver.  You've indicated that
> the device works with vfio-pci if there's no previous driver, so
> clearly intel-eth-pci isn't leaving the device in the same runtime pm
> state that it found it.  Thanks,
yes, I agree. I was confused by a number of driver calling
pm_runtime_disabe in their remove() function but those are not PCI
drivers. Unfortunataly runtime PM documentation is not very clear on
this topic. I'll submit another patch for the driver. Are there any
subsystems other than PCI that call pm_runtime_enable/disable? Right now
my patch for the driver do not call them only for PCI case.

BR,
Mikhail
> 
> Alex
> 
> > Later when QEMU tries to get the device file descriptor by calling
> > VFIO_GROUP_GET_DEVICE_FD ioctl pm_runtime_resume_and_get returns -EACCES.
> > It happens because dev->power.disable_depth == 1 .
> > 
> > vfio_group_fops_unl_ioctl(VFIO_GROUP_GET_DEVICE_FD)
> >   vfio_group_ioctl_get_device_fd()
> >     vfio_device_open()
> >       ret = device->ops->open_device()
> >         vfio_pci_open_device()
> >           vfio_pci_core_enable()
> >               ret = pm_runtime_resume_and_get();
> > 
> > This behavior was introduced by
> > commit 7ab5e10eda02 ("vfio/pci: Move the unused device into low power state with runtime PM")
> > 
> > This may be the case for any driver calling pm_runtime_disable() in its
> > .remove() callback.
> > 
> > The case when a runtime PM may be disable for a device is not handled so we
> > call pm_runtime_enable() in vfio_pci_core_register_device to re-enable it.
> > 
> > Mikhail Malyshev (1):
> >   vfio/pci: Reenable runtime PM for dynamically unbound devices
> > 
> >  drivers/vfio/pci/vfio_pci_core.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > --
> > 2.34.1
> > 
>