[2/2] vfio/mdev: don't warn if ->request is not set

Message ID 20210726143524.155779-3-hch@lst.de (mailing list archive)
State New, archived
Series [1/2] vfio/mdev: turn mdev_init into a subsys_initcall

Commit Message

Christoph Hellwig July 26, 2021, 2:35 p.m. UTC
Only a single driver actually sets the ->request method, so don't print
a scary warning if it isn't.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/mdev/mdev_core.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Cornelia Huck July 26, 2021, 5:07 p.m. UTC | #1
On Mon, Jul 26 2021, Christoph Hellwig <hch@lst.de> wrote:

> Only a single driver actually sets the ->request method, so don't print
> a scary warning if it isn't.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/vfio/mdev/mdev_core.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b16606ebafa1..b314101237fe 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -138,10 +138,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>  	if (!dev)
>  		return -EINVAL;
>  
> -	/* Not mandatory, but its absence could be a problem */
> -	if (!ops->request)
> -		dev_info(dev, "Driver cannot be asked to release device\n");
> -
>  	mutex_lock(&parent_list_lock);
>  
>  	/* Check for duplicate */

We also log a warning if we would like to call ->request() but none was
provided, so I think that's fine.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

But I wonder why nobody else implements this? Lack of surprise removal?
Jason Gunthorpe July 26, 2021, 11:07 p.m. UTC | #2
On Mon, Jul 26, 2021 at 04:35:24PM +0200, Christoph Hellwig wrote:
> Only a single driver actually sets the ->request method, so don't print
> a scary warning if it isn't.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/vfio/mdev/mdev_core.c | 4 ----
>  1 file changed, 4 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason Gunthorpe July 26, 2021, 11:09 p.m. UTC | #3
On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:

> But I wonder why nobody else implements this? Lack of surprise removal?

The only implementation triggers an eventfd that seems to be the same
eventfd as the interrupt..

Do you know how this works in userspace? I'm surprised that the
interrupt eventfd can trigger an observation that the kernel driver
wants to be unplugged?

Jason
Alex Williamson July 26, 2021, 11:28 p.m. UTC | #4
On Mon, 26 Jul 2021 20:09:06 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> 
> > But I wonder why nobody else implements this? Lack of surprise removal?  
> 
> The only implementation triggers an eventfd that seems to be the same
> eventfd as the interrupt..
> 
> Do you know how this works in userspace? I'm surprised that the
> interrupt eventfd can trigger an observation that the kernel driver
> wants to be unplugged?

I think we're talking about ccw, but I see QEMU registering separate
eventfds for each of the 3 IRQ indexes and the mdev driver specifically
triggering the req_trigger...?  Thanks,

Alex
Cornelia Huck July 27, 2021, 6:04 a.m. UTC | #5
On Mon, Jul 26 2021, Alex Williamson <alex.williamson@redhat.com> wrote:

> On Mon, 26 Jul 2021 20:09:06 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
>> 
>> > But I wonder why nobody else implements this? Lack of surprise removal?  
>> 
>> The only implementation triggers an eventfd that seems to be the same
>> eventfd as the interrupt..
>> 
>> Do you know how this works in userspace? I'm surprised that the
>> interrupt eventfd can trigger an observation that the kernel driver
>> wants to be unplugged?
>
> I think we're talking about ccw, but I see QEMU registering separate
> eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> triggering the req_trigger...?  Thanks,
>
> Alex

Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
checks), and this one.
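
To make the "one eventfd per IRQ index" arrangement described above concrete, here is a minimal userspace sketch. It is not the QEMU or vfio-ccw code: the index names (`IRQ_IO`, `IRQ_CRW`, `IRQ_REQ`) and the helper functions are hypothetical, and the "driver side" is simulated by a plain write to the eventfd. It only illustrates that a dedicated request trigger can be told apart from the interrupt triggers by which descriptor becomes readable.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical index names: I/O interrupt, CRW, and the release request. */
enum { IRQ_IO = 0, IRQ_CRW = 1, IRQ_REQ = 2, IRQ_COUNT = 3 };

/* Create one non-blocking eventfd per index. Returns 0, or -1 on failure. */
int triggers_open(int fds[IRQ_COUNT])
{
    for (int i = 0; i < IRQ_COUNT; i++) {
        fds[i] = eventfd(0, EFD_NONBLOCK);
        if (fds[i] < 0) {
            /* Undo the descriptors that were already created. */
            while (i-- > 0)
                close(fds[i]);
            return -1;
        }
    }
    return 0;
}

/* Stand-in for the driver side: signal exactly one trigger. */
int trigger_fire(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Return the index of the first pending trigger and drain its counter,
 * or -1 if nothing is pending. Does not block.
 */
int triggers_poll(const int fds[IRQ_COUNT])
{
    struct pollfd pfd[IRQ_COUNT];

    assert(fds != NULL);
    for (int i = 0; i < IRQ_COUNT; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    if (poll(pfd, IRQ_COUNT, 0) <= 0)
        return -1;
    for (int i = 0; i < IRQ_COUNT; i++) {
        uint64_t count;

        if ((pfd[i].revents & POLLIN) &&
            read(fds[i], &count, sizeof(count)) == (ssize_t)sizeof(count))
            return i;
    }
    return -1;
}

/* Close all trigger descriptors. */
void triggers_close(int fds[IRQ_COUNT])
{
    for (int i = 0; i < IRQ_COUNT; i++)
        close(fds[i]);
}
```

A caller would open the three triggers, hand each descriptor to the kernel for its index, and treat a hit on `IRQ_REQ` as "please release the device" rather than as an interrupt.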
Jason Gunthorpe July 27, 2021, 5:32 p.m. UTC | #6
On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> On Mon, Jul 26 2021, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Mon, 26 Jul 2021 20:09:06 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> >> 
> >> > But I wonder why nobody else implements this? Lack of surprise removal?  
> >> 
> >> The only implementation triggers an eventfd that seems to be the same
> >> eventfd as the interrupt..
> >> 
> >> Do you know how this works in userspace? I'm surprised that the
> >> interrupt eventfd can trigger an observation that the kernel driver
> >> wants to be unplugged?
> >
> > I think we're talking about ccw, but I see QEMU registering separate
> > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > triggering the req_trigger...?  Thanks,
> >
> > Alex
> 
> Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> checks), and this one.

If it is a dedicated eventfd for 'device being removed' why is it in
the CCW implementation and not core code?

Is PCI doing the same?

Jason
Alex Williamson July 27, 2021, 6:53 p.m. UTC | #7
On Tue, 27 Jul 2021 14:32:09 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > On Mon, Jul 26 2021, Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >  
> > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > >>   
> > >> > But I wonder why nobody else implements this? Lack of surprise removal?    
> > >> 
> > >> The only implementation triggers an eventfd that seems to be the same
> > >> eventfd as the interrupt..
> > >> 
> > >> Do you know how this works in userspace? I'm surprised that the
> > >> interrupt eventfd can trigger an observation that the kernel driver
> > >> wants to be unplugged?  
> > >
> > > I think we're talking about ccw, but I see QEMU registering separate
> > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > triggering the req_trigger...?  Thanks,
> > >
> > > Alex  
> > 
> > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > checks), and this one.  
> 
> If it is a dedicated eventfd for 'device being removed' why is it in
> the CCW implementation and not core code?

The CCW implementation (likewise the vfio-pci implementation) owns the
IRQ index address space and the decision to make this a signal to
userspace rather than perhaps some handling a device might be able to
do internally.  For instance an alternate vfio-pci implementation might
zap all mmaps, block all r/w access, and turn this into a surprise
removal.  Another implementation might be more aggressive, sending
SIGKILL to the user process.  This was the thought behind why vfio-core
triggers the driver request callback with a counter, leaving the policy
to the driver.

> Is PCI doing the same?

Yes, that's where this handling originated.  Thanks,

Alex
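
The split described above (the core repeatedly invokes an optional request callback with a counter, and the driver decides what to do with it) can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the vfio-core code: `model_device`, `model_ops`, `polite_request` and `model_wait_for_release` are hypothetical names, the open file descriptor is reduced to an integer, and the wait is bounded by `max_rounds` so the example terminates where the kernel would keep blocking.

```c
#include <assert.h>
#include <stdio.h>

struct model_device;

/* Hypothetical ops table: request is optional, as in the patch. */
struct model_ops {
    void (*request)(struct model_device *dev, unsigned int count);
};

struct model_device {
    const struct model_ops *ops;
    int open_fds;          /* references still held by userspace */
    unsigned int requests; /* how often the callback was invoked */
};

/*
 * Example driver policy: each call stands for signalling userspace.
 * In this model, userspace reacts on the third nudge and closes its FD.
 */
void polite_request(struct model_device *dev, unsigned int count)
{
    dev->requests++;
    if (count >= 2 && dev->open_fds > 0)
        dev->open_fds--;
}

/*
 * Core-side wait: nudge the driver with an increasing counter until
 * every FD is closed. Returns the number of rounds waited, or -1 if
 * the device is still held after max_rounds.
 */
int model_wait_for_release(struct model_device *dev, unsigned int max_rounds)
{
    unsigned int count = 0;

    assert(dev != NULL && dev->ops != NULL);
    while (dev->open_fds > 0) {
        if (count == max_rounds)
            return -1;
        if (dev->ops->request)
            dev->ops->request(dev, count);
        else if (count == 0) /* complain once, when the callback is needed */
            fprintf(stderr, "no request callback, blocked until released by user\n");
        count++;
    }
    return (int)count;
}
```

With `polite_request` installed and one open FD, the wait finishes after three rounds; with no callback the model gives up after `max_rounds`, and the only message is printed at the point where the callback would have been used, not at registration time.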
Jason Gunthorpe July 27, 2021, 7:03 p.m. UTC | #8
On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> On Tue, 27 Jul 2021 14:32:09 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > > On Mon, Jul 26 2021, Alex Williamson <alex.williamson@redhat.com> wrote:
> > >   
> > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >  
> > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > >>   
> > > >> > But I wonder why nobody else implements this? Lack of surprise removal?    
> > > >> 
> > > >> The only implementation triggers an eventfd that seems to be the same
> > > >> eventfd as the interrupt..
> > > >> 
> > > >> Do you know how this works in userspace? I'm surprised that the
> > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > >> wants to be unplugged?  
> > > >
> > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > triggering the req_trigger...?  Thanks,
> > > >
> > > > Alex  
> > > 
> > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > checks), and this one.  
> > 
> > If it is a dedicated eventfd for 'device being removed' why is it in
> > the CCW implementation and not core code?
> 
> The CCW implementation (likewise the vfio-pci implementation) owns
> the IRQ index address space and the decision to make this a signal
> to userspace rather than perhaps some handling a device might be
> able to do internally. 

The core code holds the vfio_device_get() so long as the FD is
open. There is no way to pass the wait_for_completion without
userspace closing the FD, so there isn't really much choice for the
drivers to do beyond signal to userspace to close the FD??

> For instance an alternate vfio-pci implementation might zap all
> mmaps, block all r/w access, and turn this into a surprise removal.

This is nice, but wouldn't close the FD, so needs core changes
anyhow..

> Another implementation might be more aggressive, sending SIGKILL
> to the user process.

We don't try to revoke FDs from the kernel, it is racy, dangerous and
unreliable.

> This was the thought behind why vfio-core triggers the driver
> request callback with a counter, leaving the policy to the driver.

IMHO subsystem policy does not belong in drivers. Down that road lies
a mess for userspace.

Jason
Alex Williamson July 27, 2021, 7:25 p.m. UTC | #9
On Tue, 27 Jul 2021 16:03:17 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> > On Tue, 27 Jul 2021 14:32:09 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:  
> > > > On Mon, Jul 26 2021, Alex Williamson <alex.williamson@redhat.com> wrote:
> > > >     
> > > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >    
> > > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > > >>     
> > > > >> > But I wonder why nobody else implements this? Lack of surprise removal?      
> > > > >> 
> > > > >> The only implementation triggers an eventfd that seems to be the same
> > > > >> eventfd as the interrupt..
> > > > >> 
> > > > >> Do you know how this works in userspace? I'm surprised that the
> > > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > > >> wants to be unplugged?    
> > > > >
> > > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > > triggering the req_trigger...?  Thanks,
> > > > >
> > > > > Alex    
> > > > 
> > > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > > checks), and this one.    
> > > 
> > > If it is a dedicated eventfd for 'device being removed' why is it in
> > > the CCW implementation and not core code?  
> > 
> > The CCW implementation (likewise the vfio-pci implementation) owns
> > the IRQ index address space and the decision to make this a signal
> > to userspace rather than perhaps some handling a device might be
> > able to do internally.   
> 
> The core code holds the vfio_device_get() so long as the FD is
> open. There is no way to pass the wait_for_completion without
> userspace closing the FD, so there isn't really much choice for the
> drivers to do beyond signal to userspace to close the FD??
> 
> > For instance an alternate vfio-pci implementation might zap all
> > mmaps, block all r/w access, and turn this into a surprise removal.  
> 
> This is nice, but wouldn't close the FD, so needs core changes
> anyhow..

Right, the core would need to be able to handle an FD disconnected from
the device, obviously some core changes would be required.

> > Another implementation might be more aggressive, sending SIGKILL
> > to the user process.  
> 
> We don't try to revoke FDs from the kernel, it is racy, dangerous and
> unreliable.

I'm not sure how trying to kill the process using an open file becomes
a revoke...  In fact, the surprise hotplug might just be able to zap
mmaps and wait for userspace to generate a SIGBUS.

> > This was the thought behind why vfio-core triggers the driver
> > request callback with a counter, leaving the policy to the driver.  
> 
> IMHO subsystem policy does not belong in drivers. Down that road lies
> a mess for userspace.

I think my argument was that to this point it's been driver policy, not
subsystem policy.  The subsystem policy is to block until the device is
released, it's the driver policy whether it has a means to implement
something to expedite that.  Thanks,

Alex

Patch

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b16606ebafa1..b314101237fe 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -138,10 +138,6 @@  int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	if (!dev)
 		return -EINVAL;
 
-	/* Not mandatory, but its absence could be a problem */
-	if (!ops->request)
-		dev_info(dev, "Driver cannot be asked to release device\n");
-
 	mutex_lock(&parent_list_lock);
 
 	/* Check for duplicate */