Message ID | 20230117134942.101112-6-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add vfio_device cdev for iommufd support | expand |
> From: Liu, Yi L <yi.l.liu@intel.com> > Sent: Tuesday, January 17, 2023 9:50 PM > > @@ -364,7 +364,7 @@ static int kvm_vfio_create(struct kvm_device *dev, > u32 type); > static struct kvm_device_ops kvm_vfio_ops = { > .name = "kvm-vfio", > .create = kvm_vfio_create, > - .destroy = kvm_vfio_destroy, > + .release = kvm_vfio_destroy, Also rename to kvm_vfio_release.
Hi Yi, On 1/17/23 14:49, Yi Liu wrote: > This is to avoid a circular refcount problem between the kvm struct and > the device file. KVM modules holds device/group file reference when the > device/group is added and releases it per removal or the last kvm reference > is released. This reference model is ok for the group since there is no > kvm reference in the group paths. > > But it is a problem for device file since the vfio devices may get kvm > reference in the device open path and put it in the device file release. > e.g. Intel kvmgt. This would result in a circular issue since the kvm > side won't put the device file reference if kvm reference is not 0, while > the vfio device side needs to put kvm reference in the release callback. > > To solve this problem for device file, let vfio provide release() which > would be called once kvm file is closed, it won't depend on the last kvm > reference. Hence avoid circular refcount problem. > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > --- > virt/kvm/vfio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > index 0f54b9d308d7..525efe37ab6d 100644 > --- a/virt/kvm/vfio.c > +++ b/virt/kvm/vfio.c > @@ -364,7 +364,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type); > static struct kvm_device_ops kvm_vfio_ops = { > .name = "kvm-vfio", > .create = kvm_vfio_create, > - .destroy = kvm_vfio_destroy, Is it safe to simply remove the destroy cb as it is called from kvm_destroy_vm/kvm_destroy_devices? Thanks Eric > + .release = kvm_vfio_destroy, > .set_attr = kvm_vfio_set_attr, > .has_attr = kvm_vfio_has_attr, > };
> From: Eric Auger <eric.auger@redhat.com> > Sent: Thursday, January 19, 2023 5:13 PM > > Hi Yi, > > On 1/17/23 14:49, Yi Liu wrote: > > This is to avoid a circular refcount problem between the kvm struct and > > the device file. KVM modules holds device/group file reference when the > > device/group is added and releases it per removal or the last kvm reference > > is released. This reference model is ok for the group since there is no > > kvm reference in the group paths. > > > > But it is a problem for device file since the vfio devices may get kvm > > reference in the device open path and put it in the device file release. > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > side won't put the device file reference if kvm reference is not 0, while > > the vfio device side needs to put kvm reference in the release callback. > > > > To solve this problem for device file, let vfio provide release() which > > would be called once kvm file is closed, it won't depend on the last kvm > > reference. Hence avoid circular refcount problem. > > > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > virt/kvm/vfio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > > index 0f54b9d308d7..525efe37ab6d 100644 > > --- a/virt/kvm/vfio.c > > +++ b/virt/kvm/vfio.c > > @@ -364,7 +364,7 @@ static int kvm_vfio_create(struct kvm_device *dev, > u32 type); > > static struct kvm_device_ops kvm_vfio_ops = { > > .name = "kvm-vfio", > > .create = kvm_vfio_create, > > - .destroy = kvm_vfio_destroy, > Is it safe to simply remove the destroy cb as it is called from > kvm_destroy_vm/kvm_destroy_devices? > According to the definition .release is considered as an alternative method to free the device: /* * Destroy is responsible for freeing dev. * * Destroy may be called before or after destructors are called * on emulated I/O regions, depending on whether a reference is * held by a vcpu or other kvm component that gets destroyed * after the emulated I/O. */ void (*destroy)(struct kvm_device *dev); /* * Release is an alternative method to free the device. It is * called when the device file descriptor is closed. Once * release is called, the destroy method will not be called * anymore as the device is removed from the device list of * the VM. kvm->lock is held. */ void (*release)(struct kvm_device *dev); Did you see any specific problem of moving this stuff to release?
On Tue, Jan 17, 2023 at 05:49:34AM -0800, Yi Liu wrote: > This is to avoid a circular refcount problem between the kvm struct and > the device file. KVM modules holds device/group file reference when the > device/group is added and releases it per removal or the last kvm reference > is released. This reference model is ok for the group since there is no > kvm reference in the group paths. > > But it is a problem for device file since the vfio devices may get kvm > reference in the device open path and put it in the device file release. > e.g. Intel kvmgt. This would result in a circular issue since the kvm > side won't put the device file reference if kvm reference is not 0, while > the vfio device side needs to put kvm reference in the release callback. > > To solve this problem for device file, let vfio provide release() which > would be called once kvm file is closed, it won't depend on the last kvm > reference. Hence avoid circular refcount problem. > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > --- > virt/kvm/vfio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> From Alex's remarks please revise the commit message and add a Fixes line of some kind that this solves the deadlock Matthew was working on, and send it stand alone right away Jason
On Thu, 19 Jan 2023 15:07:01 -0400 Jason Gunthorpe <jgg@nvidia.com> wrote: > On Tue, Jan 17, 2023 at 05:49:34AM -0800, Yi Liu wrote: > > This is to avoid a circular refcount problem between the kvm struct and > > the device file. KVM modules holds device/group file reference when the > > device/group is added and releases it per removal or the last kvm reference > > is released. This reference model is ok for the group since there is no > > kvm reference in the group paths. > > > > But it is a problem for device file since the vfio devices may get kvm > > reference in the device open path and put it in the device file release. > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > side won't put the device file reference if kvm reference is not 0, while > > the vfio device side needs to put kvm reference in the release callback. > > > > To solve this problem for device file, let vfio provide release() which > > would be called once kvm file is closed, it won't depend on the last kvm > > reference. Hence avoid circular refcount problem. > > > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > virt/kvm/vfio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > From Alex's remarks please revise the commit message and add a Fixes > line of some kind that this solves the deadlock Matthew was working > on, and send it stand alone right away Also revise the commit log since we'll be taking a reference in the group model as well. The function and comments should also be updated s/destroy/release/. Thanks, Alex
> From: Tian, Kevin <kevin.tian@intel.com> > Sent: Thursday, January 19, 2023 5:30 PM > > > From: Eric Auger <eric.auger@redhat.com> > > Sent: Thursday, January 19, 2023 5:13 PM > > > > Hi Yi, > > > > On 1/17/23 14:49, Yi Liu wrote: > > > This is to avoid a circular refcount problem between the kvm struct and > > > the device file. KVM modules holds device/group file reference when > the > > > device/group is added and releases it per removal or the last kvm > reference > > > is released. This reference model is ok for the group since there is no > > > kvm reference in the group paths. > > > > > > But it is a problem for device file since the vfio devices may get kvm > > > reference in the device open path and put it in the device file release. > > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > > side won't put the device file reference if kvm reference is not 0, while > > > the vfio device side needs to put kvm reference in the release callback. > > > > > > To solve this problem for device file, let vfio provide release() which > > > would be called once kvm file is closed, it won't depend on the last kvm > > > reference. Hence avoid circular refcount problem. > > > > > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > > --- > > > virt/kvm/vfio.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > > > index 0f54b9d308d7..525efe37ab6d 100644 > > > --- a/virt/kvm/vfio.c > > > +++ b/virt/kvm/vfio.c > > > @@ -364,7 +364,7 @@ static int kvm_vfio_create(struct kvm_device > *dev, > > u32 type); > > > static struct kvm_device_ops kvm_vfio_ops = { > > > .name = "kvm-vfio", > > > .create = kvm_vfio_create, > > > - .destroy = kvm_vfio_destroy, > > Is it safe to simply remove the destroy cb as it is called from > > kvm_destroy_vm/kvm_destroy_devices? > > Perhaps better to keep it. kvm_vfio_device is only one kind of kvm_device_type For kvm_vfio_device, it is now considered to better provide a release cb. While other kvm_device may better to have destroy cb. > > According to the definition .release is considered as an alternative > method to free the device: > > /* > * Destroy is responsible for freeing dev. > * > * Destroy may be called before or after destructors are called > * on emulated I/O regions, depending on whether a reference is > * held by a vcpu or other kvm component that gets destroyed > * after the emulated I/O. > */ > void (*destroy)(struct kvm_device *dev); > > /* > * Release is an alternative method to free the device. It is > * called when the device file descriptor is closed. Once > * release is called, the destroy method will not be called > * anymore as the device is removed from the device list of > * the VM. kvm->lock is held. > */ > void (*release)(struct kvm_device *dev); > > Did you see any specific problem of moving this stuff to release? It should only affect kvm_vfio_device itself.
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, January 20, 2023 3:07 AM > > On Tue, Jan 17, 2023 at 05:49:34AM -0800, Yi Liu wrote: > > This is to avoid a circular refcount problem between the kvm struct and > > the device file. KVM modules holds device/group file reference when the > > device/group is added and releases it per removal or the last kvm > reference > > is released. This reference model is ok for the group since there is no > > kvm reference in the group paths. > > > > But it is a problem for device file since the vfio devices may get kvm > > reference in the device open path and put it in the device file release. > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > side won't put the device file reference if kvm reference is not 0, while > > the vfio device side needs to put kvm reference in the release callback. > > > > To solve this problem for device file, let vfio provide release() which > > would be called once kvm file is closed, it won't depend on the last kvm > > reference. Hence avoid circular refcount problem. > > > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > virt/kvm/vfio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > From Alex's remarks please revise the commit message and add a Fixes > line of some kind that this solves the deadlock Matthew was working > on, and send it stand alone right away Sure.
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, January 20, 2023 3:07 AM > > On Tue, Jan 17, 2023 at 05:49:34AM -0800, Yi Liu wrote: > > This is to avoid a circular refcount problem between the kvm struct and > > the device file. KVM modules holds device/group file reference when the > > device/group is added and releases it per removal or the last kvm > reference > > is released. This reference model is ok for the group since there is no > > kvm reference in the group paths. > > > > But it is a problem for device file since the vfio devices may get kvm > > reference in the device open path and put it in the device file release. > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > side won't put the device file reference if kvm reference is not 0, while > > the vfio device side needs to put kvm reference in the release callback. > > > > To solve this problem for device file, let vfio provide release() which > > would be called once kvm file is closed, it won't depend on the last kvm > > reference. Hence avoid circular refcount problem. > > > > Suggested-by: Kevin Tian <kevin.tian@intel.com> > > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > > --- > > virt/kvm/vfio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > From Alex's remarks please revise the commit message and add a Fixes > line of some kind that this solves the deadlock Matthew was working > on, and send it stand alone right away Hi Kevin, Jason, I got a minor question. Let me check your opinions. So after this change. Say we have thread A, which is the kvm-vfio device release. It will hold the kvm_lock and delete the kvm-vfio device from the kvm-device list. It will also call into vfio to set KVM==NULL. So it will try to hold group_lock. The locking order is kvm_lock -> group_lock. Say in the same time, we have thread B closes device, it will hold group_lock first and then calls kvm_put_kvm() which is the last reference, then it would loop the kvm-device list. Currently, it is not holding kvm_lock. But it also manipulating the kvm-device list, should it hold kvm_lock? If yes, then the locking order is group_lock -> kvm_lock. Then we will have A-B B-A lock issue. Regards, Yi Liu
On Fri, Jan 20, 2023 at 02:00:26PM +0000, Liu, Yi L wrote: > Say in the same time, we have thread B closes device, it will hold > group_lock first and then calls kvm_put_kvm() which is the last > reference, then it would loop the kvm-device list. Currently, it is > not holding kvm_lock. But it also manipulating the kvm-device list, > should it hold kvm_lock? No. When using refcounts if the refcount is 0 it guarantees there are no other threads that can possibly touch this memory, so any locks internal to the memory are not required. Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, January 20, 2023 10:34 PM > > On Fri, Jan 20, 2023 at 02:00:26PM +0000, Liu, Yi L wrote: > > Say in the same time, we have thread B closes device, it will hold > > group_lock first and then calls kvm_put_kvm() which is the last > > reference, then it would loop the kvm-device list. Currently, it is > > not holding kvm_lock. But it also manipulating the kvm-device list, > > should it hold kvm_lock? > > No. When using refcounts if the refcount is 0 it guarantees there are > no other threads that can possibly touch this memory, so any locks > internal to the memory are not required. Ok. The patch has been sent out standalone. https://lore.kernel.org/kvm/20230114000351.115444-1-mjrosato@linux.ibm.com/T/#u Regards, Yi Liu
> From: Liu, Yi L <yi.l.liu@intel.com> > Sent: Friday, January 20, 2023 11:10 PM > > > From: Jason Gunthorpe <jgg@nvidia.com> > > Sent: Friday, January 20, 2023 10:34 PM > > > > On Fri, Jan 20, 2023 at 02:00:26PM +0000, Liu, Yi L wrote: > > > Say in the same time, we have thread B closes device, it will hold > > > group_lock first and then calls kvm_put_kvm() which is the last > > > reference, then it would loop the kvm-device list. Currently, it is > > > not holding kvm_lock. But it also manipulating the kvm-device list, > > > should it hold kvm_lock? > > > > No. When using refcounts if the refcount is 0 it guarantees there are > > no other threads that can possibly touch this memory, so any locks > > internal to the memory are not required. > > Ok. The patch has been sent out standalone. > > https://lore.kernel.org/kvm/20230114000351.115444-1- > mjrosato@linux.ibm.com/T/#u Wrong link. Below is the correct one.
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 0f54b9d308d7..525efe37ab6d 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -364,7 +364,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type); static struct kvm_device_ops kvm_vfio_ops = { .name = "kvm-vfio", .create = kvm_vfio_create, - .destroy = kvm_vfio_destroy, + .release = kvm_vfio_destroy, .set_attr = kvm_vfio_set_attr, .has_attr = kvm_vfio_has_attr, };
This is to avoid a circular refcount problem between the kvm struct and the device file. KVM modules holds device/group file reference when the device/group is added and releases it per removal or the last kvm reference is released. This reference model is ok for the group since there is no kvm reference in the group paths. But it is a problem for device file since the vfio devices may get kvm reference in the device open path and put it in the device file release. e.g. Intel kvmgt. This would result in a circular issue since the kvm side won't put the device file reference if kvm reference is not 0, while the vfio device side needs to put kvm reference in the release callback. To solve this problem for device file, let vfio provide release() which would be called once kvm file is closed, it won't depend on the last kvm reference. Hence avoid circular refcount problem. Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> --- virt/kvm/vfio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)