Message ID | 5-v2-6a528653a750+1578a-vfio_kvm_no_group_jgg@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Remove vfio_group from the struct file facing VFIO API | expand |
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Thursday, April 21, 2022 3:23 AM > > Instead of a general extension check change the function into a limited > test if the iommu_domain has enforced coherency, which is the only thing > kvm needs to query. > > Make the new op self contained by properly refcounting the container > before touching it. > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > --- > drivers/vfio/vfio.c | 30 +++++++++++++++++++++++++++--- > include/linux/vfio.h | 3 +-- > virt/kvm/vfio.c | 16 ++++++++-------- > 3 files changed, 36 insertions(+), 13 deletions(-) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index c9122c84583aa1..ae3e802991edf2 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -2005,11 +2005,35 @@ struct iommu_group > *vfio_file_iommu_group(struct file *file) > } > EXPORT_SYMBOL_GPL(vfio_file_iommu_group); > > -long vfio_external_check_extension(struct vfio_group *group, unsigned long > arg) > +/** > + * vfio_file_enforced_coherent - True if the DMA associated with the VFIO > file > + * is always CPU cache coherent > + * @file: VFIO group file > + * > + * Enforced coherency means that the IOMMU ignores things like the PCIe > no-snoop > + * bit in DMA transactions. A return of false indicates that the user has > + * rights to access additional instructions such as wbinvd on x86. > + */ > +bool vfio_file_enforced_coherent(struct file *file) > { > - return vfio_ioctl_check_extension(group->container, arg); > + struct vfio_group *group = file->private_data; > + bool ret; > + > + if (file->f_op != &vfio_group_fops) > + return true; > + > + /* > + * Since the coherency state is determined only once a container is > + * attached the user must do so before they can prove they have > + * permission. > + */ > + if (vfio_group_add_container_user(group)) > + return true; IMHO I still think returning an error here is better for 'user must do so' than telling inaccurate info and leading to a situation where lacking of wbinvd may incur various cache problem which is hard to debug. Yes, it's user's own problem but having a place to capture this problem early is still a nice thing to do. > + ret = vfio_ioctl_check_extension(group->container, > VFIO_DMA_CC_IOMMU); > + vfio_group_try_dissolve_container(group); > + return ret; > } > -EXPORT_SYMBOL_GPL(vfio_external_check_extension); > +EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); > > /* > * Sub-module support > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index 132cf3e7cda8db..7f022ae126a392 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -143,8 +143,7 @@ extern void vfio_group_put_external_user(struct > vfio_group *group); > extern struct vfio_group *vfio_group_get_external_user_from_dev(struct > device > *dev); > extern struct iommu_group *vfio_file_iommu_group(struct file *file); > -extern long vfio_external_check_extension(struct vfio_group *group, > - unsigned long arg); > +extern bool vfio_file_enforced_coherent(struct file *file); > > #define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned > long)) > > diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c > index 50193ae270faca..2330b0c272e671 100644 > --- a/virt/kvm/vfio.c > +++ b/virt/kvm/vfio.c > @@ -75,20 +75,20 @@ static void kvm_vfio_group_set_kvm(struct > vfio_group *group, struct kvm *kvm) > symbol_put(vfio_group_set_kvm); > } > > -static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group) > +static bool kvm_vfio_file_enforced_coherent(struct file *file) > { > - long (*fn)(struct vfio_group *, unsigned long); > - long ret; > + bool (*fn)(struct file *file); > + bool ret; > > - fn = symbol_get(vfio_external_check_extension); > + fn = symbol_get(vfio_file_enforced_coherent); > if (!fn) > return false; > > - ret = fn(vfio_group, VFIO_DMA_CC_IOMMU); > + ret = fn(file); > > - symbol_put(vfio_external_check_extension); > + symbol_put(vfio_file_enforced_coherent); > > - return ret > 0; > + return ret; > } > > #ifdef CONFIG_SPAPR_TCE_IOMMU > @@ -136,7 +136,7 @@ static void kvm_vfio_update_coherency(struct > kvm_device *dev) > mutex_lock(&kv->lock); > > list_for_each_entry(kvg, &kv->group_list, node) { > - if (!kvm_vfio_group_is_coherent(kvg->vfio_group)) { > + if (!kvm_vfio_file_enforced_coherent(kvg->file)) { > noncoherent = true; > break; > } > -- > 2.36.0
I can see why a specific error might be nice for some cases and am
open to that, but as a simple transformation this already looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
> From: Christoph Hellwig <hch@lst.de> > Sent: Thursday, April 21, 2022 1:41 PM > > I can see why a specific error might be nice for some cases and am > open to that, but as a simple transformation this already looks good: > There is a slight semantics change together with patch7. Before patch7 the container must be attached before calling KVM_DEV_VFIO_GROUP_ADD, otherwise vfio_group_get_external_user() will fail. In this case the result of cache coherency for a group is deterministic, either true or false. After patch7 vfio_group_get_external_user() is not called. It's possible that KVM_DEV_VFIO_GROUP_ADD is called before a container is attached by the group. In this case cache coherency of the group cannot be determined at that point. I prefer to returning an error in this callback so KVM can still fail adding the group, instead of letting the inaccurate coherency info to bit the user in a much latter point... Thanks Kevin
> From: Tian, Kevin > Sent: Friday, April 22, 2022 8:13 AM > > > From: Christoph Hellwig <hch@lst.de> > > Sent: Thursday, April 21, 2022 1:41 PM > > > > I can see why a specific error might be nice for some cases and am > > open to that, but as a simple transformation this already looks good: > > > > There is a slight semantics change together with patch7. > > Before patch7 the container must be attached before calling > KVM_DEV_VFIO_GROUP_ADD, otherwise vfio_group_get_external_user() > will fail. In this case the result of cache coherency for a group is > deterministic, either true or false. > > After patch7 vfio_group_get_external_user() is not called. It's possible > that KVM_DEV_VFIO_GROUP_ADD is called before a container is attached > by the group. In this case cache coherency of the group cannot be > determined at that point. I prefer to returning an error in this callback > so KVM can still fail adding the group, instead of letting the inaccurate > coherency info to bit the user in a much latter point... > More accurately: s/cache coherency/enforced coherency/ in above.
On Fri, Apr 22, 2022 at 12:32:58AM +0000, Tian, Kevin wrote: > > From: Tian, Kevin > > Sent: Friday, April 22, 2022 8:13 AM > > > > > From: Christoph Hellwig <hch@lst.de> > > > Sent: Thursday, April 21, 2022 1:41 PM > > > > > > I can see why a specific error might be nice for some cases and am > > > open to that, but as a simple transformation this already looks good: > > > > > > > There is a slight semantics change together with patch7. > > > > Before patch7 the container must be attached before calling > > KVM_DEV_VFIO_GROUP_ADD, otherwise vfio_group_get_external_user() > > will fail. In this case the result of cache coherency for a group is > > deterministic, either true or false. No, it isn't. The coherency is a propery of the iommu_domain/container and it can change when more groups are attached to the same domain. The best KVM can say is that if it is reporting coherency enforced it won't stop reporting that - which doesn't change in this series. > > After patch7 vfio_group_get_external_user() is not called. It's possible > > that KVM_DEV_VFIO_GROUP_ADD is called before a container is attached > > by the group. In this case cache coherency of the group cannot be > > determined at that point. groups don't have cache coherency, only iommu_domains do. In this case it is correct to report that no non-coherent DMA is possible becuase it isn't possible at that instant when no domain is attached. > I prefer to returning an error in this callback > so KVM can still > fail adding the group, instead of letting the inaccurate > coherency > info to bit the user in a much latter point... > > More accurately: s/cache coherency/enforced coherency/ in above. An error here will not cause KVM to fail adding the group without more changes to how it works. I'll check, it would be nice to preserve the ABI behavior of rejecting groups with no container. Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Saturday, April 23, 2022 12:59 AM > > On Fri, Apr 22, 2022 at 12:32:58AM +0000, Tian, Kevin wrote: > > > From: Tian, Kevin > > > Sent: Friday, April 22, 2022 8:13 AM > > > > > > > From: Christoph Hellwig <hch@lst.de> > > > > Sent: Thursday, April 21, 2022 1:41 PM > > > > > > > > I can see why a specific error might be nice for some cases and am > > > > open to that, but as a simple transformation this already looks good: > > > > > > > > > > There is a slight semantics change together with patch7. > > > > > > Before patch7 the container must be attached before calling > > > KVM_DEV_VFIO_GROUP_ADD, otherwise vfio_group_get_external_user() > > > will fail. In this case the result of cache coherency for a group is > > > deterministic, either true or false. > > No, it isn't. The coherency is a propery of the iommu_domain/container > and it can change when more groups are attached to the same > domain. The best KVM can say is that if it is reporting coherency > enforced it won't stop reporting that - which doesn't change in this > series. You are right. I thought this interface returns the domain property which should be locked down once being attached by a group but it actually checks the container property which is dynamic as you pointed out.
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index c9122c84583aa1..ae3e802991edf2 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -2005,11 +2005,35 @@ struct iommu_group *vfio_file_iommu_group(struct file *file) } EXPORT_SYMBOL_GPL(vfio_file_iommu_group); -long vfio_external_check_extension(struct vfio_group *group, unsigned long arg) +/** + * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file + * is always CPU cache coherent + * @file: VFIO group file + * + * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop + * bit in DMA transactions. A return of false indicates that the user has + * rights to access additional instructions such as wbinvd on x86. + */ +bool vfio_file_enforced_coherent(struct file *file) { - return vfio_ioctl_check_extension(group->container, arg); + struct vfio_group *group = file->private_data; + bool ret; + + if (file->f_op != &vfio_group_fops) + return true; + + /* + * Since the coherency state is determined only once a container is + * attached the user must do so before they can prove they have + * permission. + */ + if (vfio_group_add_container_user(group)) + return true; + ret = vfio_ioctl_check_extension(group->container, VFIO_DMA_CC_IOMMU); + vfio_group_try_dissolve_container(group); + return ret; } -EXPORT_SYMBOL_GPL(vfio_external_check_extension); +EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); /* * Sub-module support diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 132cf3e7cda8db..7f022ae126a392 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -143,8 +143,7 @@ extern void vfio_group_put_external_user(struct vfio_group *group); extern struct vfio_group *vfio_group_get_external_user_from_dev(struct device *dev); extern struct iommu_group *vfio_file_iommu_group(struct file *file); -extern long vfio_external_check_extension(struct vfio_group *group, - unsigned long arg); +extern bool vfio_file_enforced_coherent(struct file *file); #define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long)) diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 50193ae270faca..2330b0c272e671 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -75,20 +75,20 @@ static void kvm_vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) symbol_put(vfio_group_set_kvm); } -static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group) +static bool kvm_vfio_file_enforced_coherent(struct file *file) { - long (*fn)(struct vfio_group *, unsigned long); - long ret; + bool (*fn)(struct file *file); + bool ret; - fn = symbol_get(vfio_external_check_extension); + fn = symbol_get(vfio_file_enforced_coherent); if (!fn) return false; - ret = fn(vfio_group, VFIO_DMA_CC_IOMMU); + ret = fn(file); - symbol_put(vfio_external_check_extension); + symbol_put(vfio_file_enforced_coherent); - return ret > 0; + return ret; } #ifdef CONFIG_SPAPR_TCE_IOMMU @@ -136,7 +136,7 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev) mutex_lock(&kv->lock); list_for_each_entry(kvg, &kv->group_list, node) { - if (!kvm_vfio_group_is_coherent(kvg->vfio_group)) { + if (!kvm_vfio_file_enforced_coherent(kvg->file)) { noncoherent = true; break; }
Instead of a general extension check change the function into a limited test if the iommu_domain has enforced coherency, which is the only thing kvm needs to query. Make the new op self contained by properly refcounting the container before touching it. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> --- drivers/vfio/vfio.c | 30 +++++++++++++++++++++++++++--- include/linux/vfio.h | 3 +-- virt/kvm/vfio.c | 16 ++++++++-------- 3 files changed, 36 insertions(+), 13 deletions(-)