Message ID | 20190716101657.23327-4-paul.durrant@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iommu groups + cleanup | expand |
On 16/07/2019 11:16, Paul Durrant wrote: > +int iommu_group_assign(struct pci_dev *pdev, void *arg) > +{ > + const struct iommu_ops *ops = iommu_get_ops(); > + int id; > + struct iommu_group *grp; > + > + if ( !ops->get_device_group_id ) > + return 0; > + > + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); > + if ( id < 0 ) > + return -ENODATA; > + > + grp = get_iommu_group(id); > + if ( !grp ) > + return -ENOMEM; > + > + if ( iommu_verbose ) > + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", No unadorned hex numbers please. This is a recipe for confusion during debugging. Either %#x, or %u, and needs to be fixed on commit if we go with that route. ~Andrew
On 16/07/2019 11:31, Paul Durrant wrote: >> -----Original Message----- >> From: Andrew Cooper <Andrew.Cooper3@citrix.com> >> Sent: 16 July 2019 11:26 >> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org >> Cc: Jan Beulich <jbeulich@suse.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson >> <Ian.Jackson@citrix.com>; Julien Grall <julien.grall@arm.com>; Konrad Rzeszutek Wilk >> <konrad.wilk@oracle.com>; Stefano Stabellini <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>; >> Wei Liu <wl@xen.org> >> Subject: Re: [PATCH v3 3/4] iommu: introduce iommu_groups >> >> On 16/07/2019 11:16, Paul Durrant wrote: >>> +int iommu_group_assign(struct pci_dev *pdev, void *arg) >>> +{ >>> + const struct iommu_ops *ops = iommu_get_ops(); >>> + int id; >>> + struct iommu_group *grp; >>> + >>> + if ( !ops->get_device_group_id ) >>> + return 0; >>> + >>> + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); >>> + if ( id < 0 ) >>> + return -ENODATA; >>> + >>> + grp = get_iommu_group(id); >>> + if ( !grp ) >>> + return -ENOMEM; >>> + >>> + if ( iommu_verbose ) >>> + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", >> No unadorned hex numbers please. This is a recipe for confusion during >> debugging. >> >> Either %#x, or %u, and needs to be fixed on commit if we go with that route. > Personally I prefer the abstract iommu group index, more like Linux. I'd be happy to re-instate that and use %u. As for the %x here, I view it the same as the unadorned %x for seg, bus, and dev... but I'm not really fussy if you want to add a '#'. I don't mind if it is rendered in hex or dec, but it must not be ambiguous when printed. Otherwise trying to debug issues to do with IOMMU group 12 is going to be a lesson in pain and misery. PCI coordinates are not ambiguous, even when they lack the 0x prefix. Plain numbers are. ~Andrew
On 16.07.2019 12:26, Andrew Cooper wrote: > On 16/07/2019 11:16, Paul Durrant wrote: >> +int iommu_group_assign(struct pci_dev *pdev, void *arg) >> +{ >> + const struct iommu_ops *ops = iommu_get_ops(); >> + int id; >> + struct iommu_group *grp; >> + >> + if ( !ops->get_device_group_id ) >> + return 0; >> + >> + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); >> + if ( id < 0 ) >> + return -ENODATA; >> + >> + grp = get_iommu_group(id); >> + if ( !grp ) >> + return -ENOMEM; >> + >> + if ( iommu_verbose ) >> + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", > > No unadorned hex numbers please. This is a recipe for confusion during > debugging. > > Either %#x, or %u, and needs to be fixed on commit if we go with that route. I assume (hope) that you mean this for the right most number only; it's not entirely unambiguous from your reply. Jan
On 16/07/2019 11:37, Jan Beulich wrote: > On 16.07.2019 12:26, Andrew Cooper wrote: >> On 16/07/2019 11:16, Paul Durrant wrote: >>> +int iommu_group_assign(struct pci_dev *pdev, void *arg) >>> +{ >>> + const struct iommu_ops *ops = iommu_get_ops(); >>> + int id; >>> + struct iommu_group *grp; >>> + >>> + if ( !ops->get_device_group_id ) >>> + return 0; >>> + >>> + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); >>> + if ( id < 0 ) >>> + return -ENODATA; >>> + >>> + grp = get_iommu_group(id); >>> + if ( !grp ) >>> + return -ENOMEM; >>> + >>> + if ( iommu_verbose ) >>> + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", >> No unadorned hex numbers please. This is a recipe for confusion during >> debugging. >> >> Either %#x, or %u, and needs to be fixed on commit if we go with that route. > I assume (hope) that you mean this for the right most number only; it's not > entirely unambiguous from your reply. Oops yes - I only meant the IOMMU group formatter. The PCI coordinates should get subsumed by %pp in fairly short order. ~Andrew
On 16.07.2019 12:41, Andrew Cooper wrote: > On 16/07/2019 11:37, Jan Beulich wrote: >> On 16.07.2019 12:26, Andrew Cooper wrote: >>> On 16/07/2019 11:16, Paul Durrant wrote: >>>> +int iommu_group_assign(struct pci_dev *pdev, void *arg) >>>> +{ >>>> + const struct iommu_ops *ops = iommu_get_ops(); >>>> + int id; >>>> + struct iommu_group *grp; >>>> + >>>> + if ( !ops->get_device_group_id ) >>>> + return 0; >>>> + >>>> + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); >>>> + if ( id < 0 ) >>>> + return -ENODATA; >>>> + >>>> + grp = get_iommu_group(id); >>>> + if ( !grp ) >>>> + return -ENOMEM; >>>> + >>>> + if ( iommu_verbose ) >>>> + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", >>> No unadorned hex numbers please. This is a recipe for confusion during >>> debugging. >>> >>> Either %#x, or %u, and needs to be fixed on commit if we go with that route. >> I assume (hope) that you mean this for the right most number only; it's not >> entirely unambiguous from your reply. > > Oops yes - I only meant the IOMMU group formatter. > > The PCI coordinates should get subsumed by %pp in fairly short order. %op I (still) hope ;-) Jan
On 16.07.2019 12:16, Paul Durrant wrote: > --- a/xen/drivers/passthrough/Makefile > +++ b/xen/drivers/passthrough/Makefile > @@ -4,6 +4,7 @@ subdir-$(CONFIG_X86) += x86 > subdir-$(CONFIG_ARM) += arm > > obj-y += iommu.o > +obj-$(CONFIG_HAS_PCI) += groups.o I assume this dependency on PCI is temporary, as there's nothing inherently tying grouping of devices to PCI (afaict)? > +int iommu_group_assign(struct pci_dev *pdev, void *arg) > +{ > + const struct iommu_ops *ops = iommu_get_ops(); > + int id; > + struct iommu_group *grp; > + > + if ( !ops->get_device_group_id ) > + return 0; With you making groups mandatory (i.e. even solitary devices getting put in a group), shouldn't this be -EOPNOTSUPP, maybe accompanied by ASSERT_UNREACHABLE()? > + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); > + if ( id < 0 ) > + return -ENODATA; > + > + grp = get_iommu_group(id); > + if ( !grp ) > + return -ENOMEM; > + > + if ( iommu_verbose ) > + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", > + pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), grp->id); I'm not overly happy about this new logging: On modern systems a debug level run is already rather verbose about PCI devices, simply because there are so many. If my hope to not see individual devices put in groups is not going to be fulfilled, can we at least try to come to some agreement that certain devices which can't sensibly be passed through won't be assigned groups (and hence won't produce output here)? A group-less device then would automatically be unable to have its owner changed. Jan
diff --git a/xen/drivers/passthrough/Makefile b/xen/drivers/passthrough/Makefile index d50ab188c8..8a77110179 100644 --- a/xen/drivers/passthrough/Makefile +++ b/xen/drivers/passthrough/Makefile @@ -4,6 +4,7 @@ subdir-$(CONFIG_X86) += x86 subdir-$(CONFIG_ARM) += arm obj-y += iommu.o +obj-$(CONFIG_HAS_PCI) += groups.o obj-$(CONFIG_HAS_PCI) += pci.o obj-$(CONFIG_HAS_DEVICE_TREE) += device_tree.o diff --git a/xen/drivers/passthrough/groups.c b/xen/drivers/passthrough/groups.c new file mode 100644 index 0000000000..c6d00980b6 --- /dev/null +++ b/xen/drivers/passthrough/groups.c @@ -0,0 +1,91 @@ +/* + * Copyright (c) 2019 Citrix Systems Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include <xen/iommu.h> +#include <xen/radix-tree.h> + +struct iommu_group { + unsigned int id; +}; + +static struct radix_tree_root iommu_groups; + +void __init iommu_groups_init(void) +{ + radix_tree_init(&iommu_groups); +} + +static struct iommu_group *alloc_iommu_group(unsigned int id) +{ + struct iommu_group *grp = xzalloc(struct iommu_group); + + if ( !grp ) + return NULL; + + grp->id = id; + + if ( radix_tree_insert(&iommu_groups, id, grp) ) + { + xfree(grp); + grp = NULL; + } + + return grp; +} + +static struct iommu_group *get_iommu_group(unsigned int id) +{ + struct iommu_group *grp = radix_tree_lookup(&iommu_groups, id); + + if ( !grp ) + grp = alloc_iommu_group(id); + + return grp; +} + +int iommu_group_assign(struct pci_dev *pdev, void *arg) +{ + const struct iommu_ops *ops = iommu_get_ops(); + int id; + struct iommu_group *grp; + + if ( !ops->get_device_group_id ) + return 0; + + id = ops->get_device_group_id(pdev->seg, pdev->bus, pdev->devfn); + if ( id < 0 ) + return -ENODATA; + + grp = get_iommu_group(id); + if ( !grp ) + return -ENOMEM; + + if ( iommu_verbose ) + printk(XENLOG_INFO "Assign %04x:%02x:%02x.%u -> IOMMU group %x\n", + pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), + PCI_FUNC(pdev->devfn), grp->id); + + pdev->grp = grp; + + return 0; +} + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c index a7438c9c25..90fc750456 100644 --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -43,7 +43,13 @@ int __init iommu_hardware_setup(void) /* x2apic setup may have previously initialised the struct. */ ASSERT(iommu_ops.init == iommu_init_ops->ops->init); - return iommu_init_ops->setup(); + rc = iommu_init_ops->setup(); + if ( rc ) + return rc; + + iommu_groups_init(); + + return pci_pdevs_iterate(iommu_group_assign, NULL); } int iommu_enable_x2apic(void) diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h index 48f87480a7..c93f580fdc 100644 --- a/xen/include/xen/iommu.h +++ b/xen/include/xen/iommu.h @@ -317,6 +317,13 @@ DECLARE_PER_CPU(bool_t, iommu_dont_flush_iotlb); extern struct spinlock iommu_pt_cleanup_lock; extern struct page_list_head iommu_pt_cleanup_list; +#ifdef CONFIG_HAS_PCI + +void iommu_groups_init(void); +int iommu_group_assign(struct pci_dev *pdev, void *arg); + +#endif /* CONFIG_HAS_PCI */ + #endif /* _IOMMU_H_ */ /* diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h index 79eb25417b..e1f887af1c 100644 --- a/xen/include/xen/pci.h +++ b/xen/include/xen/pci.h @@ -79,6 +79,8 @@ struct pci_dev { struct list_head alldevs_list; struct list_head domain_list; + struct iommu_group *grp; + struct list_head msi_list; struct arch_msix *msix;