Message ID | 20160224194406.7585.17447.stgit@bhelgaas-glaptop2.roam.corp.google.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Wed, Feb 24, 2016 at 01:44:06PM -0600, Bjorn Helgaas wrote: > From: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> > > <Insert changelog here> (Sorry, I should have copied this changelog in the patch; I copied this manually from your v3 posting): > This patch solves IOMMU support issues with PCIe non-transparent bridges > that use Requester ID look-up tables (LUT), e.g. PEX8733. Before exiting > the bridge, packet's RID is rewritten according to LUT programmed by > a driver. Modified packets are then passed to a destination bus and > processed upstream. The problem is that such packets seem to come from > non-existent nodes that are hidden behind NTB and are not discoverable > by a destination node, so IOMMU discards them. Adding DMA alias for a > given LUT entry allows IOMMU to create a proper mapping that enables > inter-node communication. A specific example here would help me understand. Here's how I understand this (correct me if I'm wrong): We're talking about a DMA packet being forwarded upstream from an NTB. The NTB uses the LUT to rewrite the RID in the DMA packet. The new RID from the LUT is unknown to the IOMMU, so it discards the DMA packet. > The current DMA alias implementation supports only single alias, so it's > not possible to connect more than two nodes when IOMMU is enabled. This > implementation enables all possible aliases on a given bus (256) that > are stored in a bitset. Alias devfn is directly translated to a bit > number. The bitset is not allocated for devices that have no need for > DMA aliases. I think "two nodes" is referring to two PCIe devices on the other side of the NTB. You want DMA packets from those devices to have different RIDs so the IOMMU can distinguish them. The LUT entries basically create aliases of the NTB (one alias for each device beyond the NTB). Your quirk uses pci_add_dma_alias(), and the aliases are all on the same bus as the NTB itself. The quirk adds PCI_DEVFN(0x10, 0x0), PCI_DEVFN(0x11, 0x0), and PCI_DEVFN(0x12, 0x0). Shouldn't there be some connection between this and the LUT programming? I assume the LUT is programmed to correspond to those aliases. Does this mean you're limited to three devices beyond the NTB? > --- > drivers/iommu/iommu.c | 17 ++++++++++------- > drivers/pci/pci.c | 11 +++++++++-- > drivers/pci/probe.c | 1 + > drivers/pci/search.c | 14 +++++++++----- > include/linux/pci.h | 4 +--- > 5 files changed, 30 insertions(+), 17 deletions(-) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 0e3b009..a214e19 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -659,9 +659,15 @@ static struct iommu_group *get_pci_function_alias_group(struct pci_dev *pdev, > return NULL; > } > > +static bool dma_alias_is_enabled(struct pci_dev *dev, u8 devfn) > +{ > + return dev->dma_alias_mask && > + test_bit(devfn, dev->dma_alias_mask); > +} > + > /* > - * Look for aliases to or from the given device for exisiting groups. The > - * dma_alias_devfn only supports aliases on the same bus, therefore the search > + * Look for aliases to or from the given device for existing groups. DMA > + * aliases are only supported on the same bus, therefore the search I'm trying to reconcile this statement that "DMA aliases are only supported on the same bus" (which was there even before this patch) with the fact that pci_for_each_dma_alias() does not have that limitation. > * space is quite small (especially since we're really only looking at pcie > * device, and therefore only expect multiple slots on the root complex or > * downstream switch ports). It's conceivable though that a pair of > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, > continue; > > /* We alias them or they alias us */ > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > - pdev->dma_alias_devfn == tmp->devfn) || > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > - tmp->dma_alias_devfn == pdev->devfn)) { > - > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > + dma_alias_is_enabled(tmp, pdev->devfn)) { > group = get_pci_alias_group(tmp, devfns); We basically have this: for_each_pci_dev(tmp) { if (<pdev and tmp are DMA aliases>) group = get_pci_alias_group(); ... } The DMA alias stuff relies on PCI internals, so it doesn't doesn't seem quite right to use things like PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and dma_alias_devfn here in the IOMMU code. I'm trying to figure out why we don't do something like the following instead: callback(struct pci_dev *pdev, u16 alias, void *opaque) { struct iommu_group *group; group = get_pci_alias_group(); if (group) return group; return 0; } pci_for_each_dma_alias(pdev, callback, ...); Is the existing code some sort of optimization, e.g., checking PCI_DEV_FLAGS_DMA_ALIAS_DEVFN is cheaper than using pci_for_each_dma_alias()? It seems like this won't work for some very unlikely but theoretically possible topologies, e.g., PCIe Root Complex/IOMMU PCIe switch A PCIe to conventional PCI bridge PCI to PCIe Root Complex PCIe NTB Here, I think the IOMMU will only see RIDs from PCIe switch A, but the current code only looks at DMA aliases that are on the same bus as the PCIe NTB. Wouldn't using pci_for_each_dma_alias() handle this correctly? > if (group) { > pci_dev_put(tmp); -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Bjorn Helgaas [mailto:helgaas@kernel.org] > Sent: Thursday, February 25, 2016 3:39 PM > To: Bjorn Helgaas <bhelgaas@google.com> > Cc: Lawrynowicz, Jacek <jacek.lawrynowicz@intel.com>; linux- > pci@vger.kernel.org; Alex Williamson <alex.williamson@redhat.com>; Joerg > Roedel <jroedel@suse.de>; David Woodhouse <dwmw2@infradead.org>; > iommu@lists.linux-foundation.org > Subject: Re: [PATCH v4 3/6] PCI: Add support for multiple DMA aliases > > On Wed, Feb 24, 2016 at 01:44:06PM -0600, Bjorn Helgaas wrote: > > From: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> > > > > <Insert changelog here> > > (Sorry, I should have copied this changelog in the patch; I copied > this manually from your v3 posting): > > > This patch solves IOMMU support issues with PCIe non-transparent bridges > > that use Requester ID look-up tables (LUT), e.g. PEX8733. Before exiting > > the bridge, packet's RID is rewritten according to LUT programmed by > > a driver. Modified packets are then passed to a destination bus and > > processed upstream. The problem is that such packets seem to come from > > non-existent nodes that are hidden behind NTB and are not discoverable > > by a destination node, so IOMMU discards them. Adding DMA alias for a > > given LUT entry allows IOMMU to create a proper mapping that enables > > inter-node communication. > > A specific example here would help me understand. Here's how I > understand this (correct me if I'm wrong): We're talking about a DMA > packet being forwarded upstream from an NTB. The NTB uses the LUT to > rewrite the RID in the DMA packet. The new RID from the LUT is > unknown to the IOMMU, so it discards the DMA packet. Yes, this is exactly the problem. > > The current DMA alias implementation supports only single alias, so it's > > not possible to connect more than two nodes when IOMMU is enabled. This > > implementation enables all possible aliases on a given bus (256) that > > are stored in a bitset. Alias devfn is directly translated to a bit > > number. The bitset is not allocated for devices that have no need for > > DMA aliases. > > I think "two nodes" is referring to two PCIe devices on the other side > of the NTB. You want DMA packets from those devices to have different > RIDs so the IOMMU can distinguish them. Right. > The LUT entries basically create aliases of the NTB (one alias for > each device beyond the NTB). Your quirk uses pci_add_dma_alias(), and > the aliases are all on the same bus as the NTB itself. > > The quirk adds PCI_DEVFN(0x10, 0x0), PCI_DEVFN(0x11, 0x0), and > PCI_DEVFN(0x12, 0x0). Shouldn't there be some connection between this > and the LUT programming? I assume the LUT is programmed to correspond > to those aliases. Does this mean you're limited to three devices > beyond the NTB? Yes, there is an indirect connection between LUT table and devfns used in the quirk. Dev part is an offset in the LUT table and function is taken from the device behind the NTB. So the driver can only change the dev part by using different LUT offsets. We don't plan to modify this quirk. The number of PCIe devices beyond single x200 card NTB will not change. Two are used by x200 CPU (host bridge & root port) and one is used by x200 DMA engine. I'm not sure introducing some dependencies to make sure the offsets are set correctly is really worth it. So regarding the improvements in the patch description, you want me to update and repost it? BTW I posted x200 DMA driver (the client for this change) on DMA list: https://lkml.org/lkml/2016/2/9/287 I'm working on integrating review comments and hope to get it included in 4.6. Regards, Jacek > > --- > > drivers/iommu/iommu.c | 17 ++++++++++------- > > drivers/pci/pci.c | 11 +++++++++-- > > drivers/pci/probe.c | 1 + > > drivers/pci/search.c | 14 +++++++++----- > > include/linux/pci.h | 4 +--- > > 5 files changed, 30 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > index 0e3b009..a214e19 100644 > > --- a/drivers/iommu/iommu.c > > +++ b/drivers/iommu/iommu.c > > @@ -659,9 +659,15 @@ static struct iommu_group > *get_pci_function_alias_group(struct pci_dev *pdev, > > return NULL; > > } > > > > +static bool dma_alias_is_enabled(struct pci_dev *dev, u8 devfn) > > +{ > > + return dev->dma_alias_mask && > > + test_bit(devfn, dev->dma_alias_mask); > > +} > > + > > /* > > - * Look for aliases to or from the given device for exisiting groups. The > > - * dma_alias_devfn only supports aliases on the same bus, therefore the > search > > + * Look for aliases to or from the given device for existing groups. DMA > > + * aliases are only supported on the same bus, therefore the search > > I'm trying to reconcile this statement that "DMA aliases are only > supported on the same bus" (which was there even before this patch) > with the fact that pci_for_each_dma_alias() does not have that > limitation. > > > * space is quite small (especially since we're really only looking at pcie > > * device, and therefore only expect multiple slots on the root complex or > > * downstream switch ports). It's conceivable though that a pair of > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct > pci_dev *pdev, > > continue; > > > > /* We alias them or they alias us */ > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) > && > > - pdev->dma_alias_devfn == tmp->devfn) || > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > - tmp->dma_alias_devfn == pdev->devfn)) { > > - > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > group = get_pci_alias_group(tmp, devfns); > > We basically have this: > > for_each_pci_dev(tmp) { > if (<pdev and tmp are DMA aliases>) > group = get_pci_alias_group(); > ... > } > > The DMA alias stuff relies on PCI internals, so it doesn't doesn't > seem quite right to use things like PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and > dma_alias_devfn here in the IOMMU code. > > I'm trying to figure out why we don't do something like the following > instead: > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > { > struct iommu_group *group; > > group = get_pci_alias_group(); > if (group) > return group; > > return 0; > } > > pci_for_each_dma_alias(pdev, callback, ...); > > Is the existing code some sort of optimization, e.g., checking > PCI_DEV_FLAGS_DMA_ALIAS_DEVFN is cheaper than using > pci_for_each_dma_alias()? > > It seems like this won't work for some very unlikely but theoretically > possible topologies, e.g., > > PCIe Root Complex/IOMMU > PCIe switch A > PCIe to conventional PCI bridge > PCI to PCIe Root Complex > PCIe NTB > > Here, I think the IOMMU will only see RIDs from PCIe switch A, but the > current code only looks at DMA aliases that are on the same bus as the > PCIe NTB. Wouldn't using pci_for_each_dma_alias() handle this > correctly? > > > if (group) { > > pci_dev_put(tmp);
On Thu, Feb 25, 2016 at 03:41:51PM +0000, Lawrynowicz, Jacek wrote: > > -----Original Message----- > > From: Bjorn Helgaas [mailto:helgaas@kernel.org] > > Sent: Thursday, February 25, 2016 3:39 PM > > To: Bjorn Helgaas <bhelgaas@google.com> > > Cc: Lawrynowicz, Jacek <jacek.lawrynowicz@intel.com>; linux- > > pci@vger.kernel.org; Alex Williamson <alex.williamson@redhat.com>; Joerg > > Roedel <jroedel@suse.de>; David Woodhouse <dwmw2@infradead.org>; > > iommu@lists.linux-foundation.org > > Subject: Re: [PATCH v4 3/6] PCI: Add support for multiple DMA aliases > > > > On Wed, Feb 24, 2016 at 01:44:06PM -0600, Bjorn Helgaas wrote: > > > From: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> > > > > > > <Insert changelog here> > > > > (Sorry, I should have copied this changelog in the patch; I copied > > this manually from your v3 posting): > > > > > This patch solves IOMMU support issues with PCIe non-transparent bridges > > > that use Requester ID look-up tables (LUT), e.g. PEX8733. Before exiting > > > the bridge, packet's RID is rewritten according to LUT programmed by > > > a driver. Modified packets are then passed to a destination bus and > > > processed upstream. The problem is that such packets seem to come from > > > non-existent nodes that are hidden behind NTB and are not discoverable > > > by a destination node, so IOMMU discards them. Adding DMA alias for a > > > given LUT entry allows IOMMU to create a proper mapping that enables > > > inter-node communication. > > > > A specific example here would help me understand. Here's how I > > understand this (correct me if I'm wrong): We're talking about a DMA > > packet being forwarded upstream from an NTB. The NTB uses the LUT to > > rewrite the RID in the DMA packet. The new RID from the LUT is > > unknown to the IOMMU, so it discards the DMA packet. > > Yes, this is exactly the problem. > > > > The current DMA alias implementation supports only single alias, so it's > > > not possible to connect more than two nodes when IOMMU is enabled. This > > > implementation enables all possible aliases on a given bus (256) that > > > are stored in a bitset. Alias devfn is directly translated to a bit > > > number. The bitset is not allocated for devices that have no need for > > > DMA aliases. > > > > I think "two nodes" is referring to two PCIe devices on the other side > > of the NTB. You want DMA packets from those devices to have different > > RIDs so the IOMMU can distinguish them. > > Right. > > > The LUT entries basically create aliases of the NTB (one alias for > > each device beyond the NTB). Your quirk uses pci_add_dma_alias(), and > > the aliases are all on the same bus as the NTB itself. > > > > The quirk adds PCI_DEVFN(0x10, 0x0), PCI_DEVFN(0x11, 0x0), and > > PCI_DEVFN(0x12, 0x0). Shouldn't there be some connection between this > > and the LUT programming? I assume the LUT is programmed to correspond > > to those aliases. Does this mean you're limited to three devices > > beyond the NTB? > > Yes, there is an indirect connection between LUT table and devfns used in the > quirk. > Dev part is an offset in the LUT table and function is taken from the device > behind the NTB. > So the driver can only change the dev part by using different LUT offsets. > We don't plan to modify this quirk. The number of PCIe devices beyond single > x200 card NTB will not change. > Two are used by x200 CPU (host bridge & root port) and one is used by x200 DMA > engine. > I'm not sure introducing some dependencies to make sure the offsets are set > correctly is really worth it. I'd like at least a comment that points to the specific x200 code that must coordinate with this. > So regarding the improvements in the patch description, you want me to update > and repost it? Yes, please. > BTW I posted x200 DMA driver (the client for this change) on DMA list: > https://lkml.org/lkml/2016/2/9/287 > I'm working on integrating review comments and hope to get it included in 4.6. What about my questions on the code itself, below? > > > --- > > > drivers/iommu/iommu.c | 17 ++++++++++------- > > > drivers/pci/pci.c | 11 +++++++++-- > > > drivers/pci/probe.c | 1 + > > > drivers/pci/search.c | 14 +++++++++----- > > > include/linux/pci.h | 4 +--- > > > 5 files changed, 30 insertions(+), 17 deletions(-) > > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > > index 0e3b009..a214e19 100644 > > > --- a/drivers/iommu/iommu.c > > > +++ b/drivers/iommu/iommu.c > > > @@ -659,9 +659,15 @@ static struct iommu_group > > *get_pci_function_alias_group(struct pci_dev *pdev, > > > return NULL; > > > } > > > > > > +static bool dma_alias_is_enabled(struct pci_dev *dev, u8 devfn) > > > +{ > > > + return dev->dma_alias_mask && > > > + test_bit(devfn, dev->dma_alias_mask); > > > +} > > > + > > > /* > > > - * Look for aliases to or from the given device for exisiting groups. The > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the > > search > > > + * Look for aliases to or from the given device for existing groups. DMA > > > + * aliases are only supported on the same bus, therefore the search > > > > I'm trying to reconcile this statement that "DMA aliases are only > > supported on the same bus" (which was there even before this patch) > > with the fact that pci_for_each_dma_alias() does not have that > > limitation. > > > > > * space is quite small (especially since we're really only looking at pcie > > > * device, and therefore only expect multiple slots on the root complex or > > > * downstream switch ports). It's conceivable though that a pair of > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct > > pci_dev *pdev, > > > continue; > > > > > > /* We alias them or they alias us */ > > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) > > && > > > - pdev->dma_alias_devfn == tmp->devfn) || > > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > - tmp->dma_alias_devfn == pdev->devfn)) { > > > - > > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > > group = get_pci_alias_group(tmp, devfns); > > > > We basically have this: > > > > for_each_pci_dev(tmp) { > > if (<pdev and tmp are DMA aliases>) > > group = get_pci_alias_group(); > > ... > > } > > > > The DMA alias stuff relies on PCI internals, so it doesn't doesn't > > seem quite right to use things like PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and > > dma_alias_devfn here in the IOMMU code. > > > > I'm trying to figure out why we don't do something like the following > > instead: > > > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > > { > > struct iommu_group *group; > > > > group = get_pci_alias_group(); > > if (group) > > return group; > > > > return 0; > > } > > > > pci_for_each_dma_alias(pdev, callback, ...); > > > > Is the existing code some sort of optimization, e.g., checking > > PCI_DEV_FLAGS_DMA_ALIAS_DEVFN is cheaper than using > > pci_for_each_dma_alias()? > > > > It seems like this won't work for some very unlikely but theoretically > > possible topologies, e.g., > > > > PCIe Root Complex/IOMMU > > PCIe switch A > > PCIe to conventional PCI bridge > > PCI to PCIe Root Complex > > PCIe NTB > > > > Here, I think the IOMMU will only see RIDs from PCIe switch A, but the > > current code only looks at DMA aliases that are on the same bus as the > > PCIe NTB. Wouldn't using pci_for_each_dma_alias() handle this > > correctly? > > > > > if (group) { > > > pci_dev_put(tmp); -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Feb 29, 2016 at 04:44:17PM -0600, Bjorn Helgaas wrote: > On Thu, Feb 25, 2016 at 03:41:51PM +0000, Lawrynowicz, Jacek wrote: > > > -----Original Message----- > > > From: Bjorn Helgaas [mailto:helgaas@kernel.org] > > > Sent: Thursday, February 25, 2016 3:39 PM > > > To: Bjorn Helgaas <bhelgaas@google.com> > > > Cc: Lawrynowicz, Jacek <jacek.lawrynowicz@intel.com>; linux- > > > pci@vger.kernel.org; Alex Williamson <alex.williamson@redhat.com>; Joerg > > > Roedel <jroedel@suse.de>; David Woodhouse <dwmw2@infradead.org>; > > > iommu@lists.linux-foundation.org > > > Subject: Re: [PATCH v4 3/6] PCI: Add support for multiple DMA aliases > > > > > > On Wed, Feb 24, 2016 at 01:44:06PM -0600, Bjorn Helgaas wrote: > > > > From: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> > > What about my questions on the code itself, below? > Sorry but I'm not able to answer it. Maybe Joerg or David could help here? > > > > --- > > > > drivers/iommu/iommu.c | 17 ++++++++++------- > > > > drivers/pci/pci.c | 11 +++++++++-- > > > > drivers/pci/probe.c | 1 + > > > > drivers/pci/search.c | 14 +++++++++----- > > > > include/linux/pci.h | 4 +--- > > > > 5 files changed, 30 insertions(+), 17 deletions(-) > > > > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > > > > index 0e3b009..a214e19 100644 > > > > --- a/drivers/iommu/iommu.c > > > > +++ b/drivers/iommu/iommu.c > > > > @@ -659,9 +659,15 @@ static struct iommu_group > > > *get_pci_function_alias_group(struct pci_dev *pdev, > > > > return NULL; > > > > } > > > > > > > > +static bool dma_alias_is_enabled(struct pci_dev *dev, u8 devfn) > > > > +{ > > > > + return dev->dma_alias_mask && > > > > + test_bit(devfn, dev->dma_alias_mask); > > > > +} > > > > + > > > > /* > > > > - * Look for aliases to or from the given device for exisiting groups. The > > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the > > > search > > > > + * Look for aliases to or from the given device for existing groups. DMA > > > > + * aliases are only supported on the same bus, therefore the search > > > > > > I'm trying to reconcile this statement that "DMA aliases are only > > > supported on the same bus" (which was there even before this patch) > > > with the fact that pci_for_each_dma_alias() does not have that > > > limitation. > > > > > > > * space is quite small (especially since we're really only looking at pcie > > > > * device, and therefore only expect multiple slots on the root complex or > > > > * downstream switch ports). It's conceivable though that a pair of > > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct > > > pci_dev *pdev, > > > > continue; > > > > > > > > /* We alias them or they alias us */ > > > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) > > > && > > > > - pdev->dma_alias_devfn == tmp->devfn) || > > > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > > - tmp->dma_alias_devfn == pdev->devfn)) { > > > > - > > > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > > > group = get_pci_alias_group(tmp, devfns); > > > > > > We basically have this: > > > > > > for_each_pci_dev(tmp) { > > > if (<pdev and tmp are DMA aliases>) > > > group = get_pci_alias_group(); > > > ... > > > } > > > > > > The DMA alias stuff relies on PCI internals, so it doesn't doesn't > > > seem quite right to use things like PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and > > > dma_alias_devfn here in the IOMMU code. > > > > > > I'm trying to figure out why we don't do something like the following > > > instead: > > > > > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > > > { > > > struct iommu_group *group; > > > > > > group = get_pci_alias_group(); > > > if (group) > > > return group; > > > > > > return 0; > > > } > > > > > > pci_for_each_dma_alias(pdev, callback, ...); > > > > > > Is the existing code some sort of optimization, e.g., checking > > > PCI_DEV_FLAGS_DMA_ALIAS_DEVFN is cheaper than using > > > pci_for_each_dma_alias()? > > > > > > It seems like this won't work for some very unlikely but theoretically > > > possible topologies, e.g., > > > > > > PCIe Root Complex/IOMMU > > > PCIe switch A > > > PCIe to conventional PCI bridge > > > PCI to PCIe Root Complex > > > PCIe NTB > > > > > > Here, I think the IOMMU will only see RIDs from PCIe switch A, but the > > > current code only looks at DMA aliases that are on the same bus as the > > > PCIe NTB. Wouldn't using pci_for_each_dma_alias() handle this > > > correctly? > > > > > > > if (group) { > > > > pci_dev_put(tmp); > > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2016-02-25 at 08:38 -0600, Bjorn Helgaas wrote: > > > /* > > - * Look for aliases to or from the given device for exisiting groups. The > > - * dma_alias_devfn only supports aliases on the same bus, therefore the search > > + * Look for aliases to or from the given device for existing groups. DMA > > + * aliases are only supported on the same bus, therefore the search > > I'm trying to reconcile this statement that "DMA aliases are only > supported on the same bus" (which was there even before this patch) > with the fact that pci_for_each_dma_alias() does not have that > limitation. Doesn't it? You can still only set a DMA alias on the same bus with pci_add_dma_alias(), can't you? > > * space is quite small (especially since we're really only looking at pcie > > * device, and therefore only expect multiple slots on the root complex or > > * downstream switch ports). It's conceivable though that a pair of > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, > > continue; > > > > /* We alias them or they alias us */ > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > - pdev->dma_alias_devfn == tmp->devfn) || > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > - tmp->dma_alias_devfn == pdev->devfn)) { > > - > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > group = get_pci_alias_group(tmp, devfns); > > We basically have this: > > for_each_pci_dev(tmp) { > if () > group = get_pci_alias_group(); > ... > } Strictly, that's: for_each_pci_dev(tmp) { if (pdev is an alias of tmp || tmp is an alias of pdev) group = get_pci_alias_group(); ... } > The DMA alias stuff relies on PCI internals, so it doesn't doesn't > seem quite right to use things like PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and > dma_alias_devfn here in the IOMMU code. > > I'm trying to figure out why we don't do something like the following > instead: > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > { > struct iommu_group *group; > > group = get_pci_alias_group(); > if (group) > return group; > > return 0; > } > > pci_for_each_dma_alias(pdev, callback, ...); And this would be equivalent to for_each_pci_dev(tmp) { if (tmp is an alias of pdev) group = get_pci_alias_group(); ... } The "is an alias of" property is not commutative. Perhaps it should be. But that's hard because in some cases the alias doesn't even *exist* as a real PCI device. It's just that you appear to get DMA transactions from a given source-id. -- dwmw2
On Mon, Mar 14, 2016 at 10:43:40PM +0000, David Woodhouse wrote: > On Thu, 2016-02-25 at 08:38 -0600, Bjorn Helgaas wrote: > > > > > /* > > > - * Look for aliases to or from the given device for exisiting groups. The > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the search > > > + * Look for aliases to or from the given device for existing groups. DMA > > > + * aliases are only supported on the same bus, therefore the search > > > > I'm trying to reconcile this statement that "DMA aliases are only > > supported on the same bus" (which was there even before this patch) > > with the fact that pci_for_each_dma_alias() does not have that > > limitation. > > Doesn't it? You can still only set a DMA alias on the same bus with > pci_add_dma_alias(), can't you? I guess it's true that PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and the proposed pci_add_dma_alias() only add aliases on the same bus. I was thinking about a scenario like this: 00:00.0 PCIe-to-PCI bridge to [bus 01] 01:01.0 conventional PCI device where I think 01:00.0 is a DMA alias for 01:01.0 because the bridge takes ownership of DMA transactions from 01:01.0 and assigns a Requester ID of 01:00.0 (secondary bus number, device 0, function 0). > > > * space is quite small (especially since we're really only looking at pcie > > > * device, and therefore only expect multiple slots on the root complex or > > > * downstream switch ports). It's conceivable though that a pair of > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, > > > continue; > > > > > > /* We alias them or they alias us */ > > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > - pdev->dma_alias_devfn == tmp->devfn) || > > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > - tmp->dma_alias_devfn == pdev->devfn)) { > > > - > > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > > group = get_pci_alias_group(tmp, devfns); > > > > We basically have this: > > > > for_each_pci_dev(tmp) { > > if () > > group = get_pci_alias_group(); > > ... > > } > > Strictly, that's: > > for_each_pci_dev(tmp) { > if (pdev is an alias of tmp || tmp is an alias of pdev) > group = get_pci_alias_group(); > ... > } OK. > > I'm trying to figure out why we don't do something like the following > > instead: > > > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > > { > > struct iommu_group *group; > > > > group = get_pci_alias_group(); > > if (group) > > return group; > > > > return 0; > > } > > > > pci_for_each_dma_alias(pdev, callback, ...); > > And this would be equivalent to > > for_each_pci_dev(tmp) { > if (tmp is an alias of pdev) > group = get_pci_alias_group(); > ... > } > > The "is an alias of" property is not commutative. Perhaps it should be. > But that's hard because in some cases the alias doesn't even *exist* as > a real PCI device. It's just that you appear to get DMA transactions > from a given source-id. Right. In my example above, 01:00.0 is not a PCI device; it's only a Requester ID that is fabricated by the bridge when it forwards DMA transactions upstream. I think I'm confused because I don't really understand IOMMU groups. Let me explain what I think they are and you can correct me when I go wrong. The iommu_group_alloc() comment says "The IOMMU group represents the minimum granularity of the IOMMU." So I suppose the IOMMU cannot distinguish between devices in a group. All the devices in the group use the same set of DMA mappings. Granting device A DMA access to a buffer grants the same access to all other members of A's IOMMU group. That would mean my question was fundamentally backwards. In get_pci_alias_group(A), we're not trying to figure out what all the aliases of A are, which is what pci_for_each_dma_alias() does. Instead, we're trying to figure out which IOMMU group A belongs to. But I still don't quite understand how aliases fit into this. Let's go back to my example and assume we've already put 00:00.0 and 01:01.0 in IOMMU groups: 00:00.0 PCIe-to-PCI bridge to [bus 01] # in IOMMU group G0 01:01.0 conventional PCI device # in IOMMU group G1 I assume these devices are in different IOMMU groups because if the bridge generated its own DMA, it would use Requester ID 00:00.0, which is distinct from the 01:00.0 it would use when forwarding DMAs from its secondary side. What happens when we add 01:02.0? I think 01:01.0 and 01:02.0 should both end up in IOMMU group G1 because the IOMMU will see only Requester ID 01:00.0, so it can't distinguish them. When we add 01:02.0, the ops->add_device() ... ops->device_group() path calls pci_device_group(01:02.0): pci_device_group(01:02.0) pci_for_each_dma_alias(01:02.0, get_pci_alias_or_group) get_pci_alias_or_group(01:02.0, 01:02.0) # callback return 0 # 01:02.0 group not set yet get_pci_alias_or_group(00:00.0, 01:00.0) # callback return 1 # 00:00.0 is in G0 It seems like we'll assign 01:02.0 to group G0, when I think it should be in G1. Where did I go wrong? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 15, 2016 at 07:48:17PM -0500, Bjorn Helgaas wrote: > On Mon, Mar 14, 2016 at 10:43:40PM +0000, David Woodhouse wrote: > > On Thu, 2016-02-25 at 08:38 -0600, Bjorn Helgaas wrote: > > > > > > > /* > > > > - * Look for aliases to or from the given device for exisiting groups. The > > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the search > > > > + * Look for aliases to or from the given device for existing groups. DMA > > > > + * aliases are only supported on the same bus, therefore the search > > > > > > I'm trying to reconcile this statement that "DMA aliases are only > > > supported on the same bus" (which was there even before this patch) > > > with the fact that pci_for_each_dma_alias() does not have that > > > limitation. > > > > Doesn't it? You can still only set a DMA alias on the same bus with > > pci_add_dma_alias(), can't you? > > I guess it's true that PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and the proposed > pci_add_dma_alias() only add aliases on the same bus. I was thinking > about a scenario like this: > > 00:00.0 PCIe-to-PCI bridge to [bus 01] > 01:01.0 conventional PCI device > > where I think 01:00.0 is a DMA alias for 01:01.0 because the bridge > takes ownership of DMA transactions from 01:01.0 and assigns a > Requester ID of 01:00.0 (secondary bus number, device 0, function 0). > > > > > * space is quite small (especially since we're really only looking at pcie > > > > * device, and therefore only expect multiple slots on the root complex or > > > > * downstream switch ports). It's conceivable though that a pair of > > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, > > > > continue; > > > > > > > > /* We alias them or they alias us */ > > > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > > - pdev->dma_alias_devfn == tmp->devfn) || > > > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > > - tmp->dma_alias_devfn == pdev->devfn)) { > > > > - > > > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > > > group = get_pci_alias_group(tmp, devfns); > > > > > > We basically have this: > > > > > > for_each_pci_dev(tmp) { > > > if () > > > group = get_pci_alias_group(); > > > ... > > > } > > > > Strictly, that's: > > > > for_each_pci_dev(tmp) { > > if (pdev is an alias of tmp || tmp is an alias of pdev) > > group = get_pci_alias_group(); > > ... > > } > > OK. > > > > I'm trying to figure out why we don't do something like the following > > > instead: > > > > > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > > > { > > > struct iommu_group *group; > > > > > > group = get_pci_alias_group(); > > > if (group) > > > return group; > > > > > > return 0; > > > } > > > > > > pci_for_each_dma_alias(pdev, callback, ...); > > > > And this would be equivalent to > > > > for_each_pci_dev(tmp) { > > if (tmp is an alias of pdev) > > group = get_pci_alias_group(); > > ... > > } > > > > The "is an alias of" property is not commutative. Perhaps it should be. > > But that's hard because in some cases the alias doesn't even *exist* as > > a real PCI device. It's just that you appear to get DMA transactions > > from a given source-id. > > Right. In my example above, 01:00.0 is not a PCI device; it's only a > Requester ID that is fabricated by the bridge when it forwards DMA > transactions upstream. > > I think I'm confused because I don't really understand IOMMU groups. > > Let me explain what I think they are and you can correct me when I go > wrong. The iommu_group_alloc() comment says "The IOMMU group > represents the minimum granularity of the IOMMU." So I suppose the > IOMMU cannot distinguish between devices in a group. All the devices > in the group use the same set of DMA mappings. Granting device A DMA > access to a buffer grants the same access to all other members of A's > IOMMU group. > > That would mean my question was fundamentally backwards. In > get_pci_alias_group(A), we're not trying to figure out what all the > aliases of A are, which is what pci_for_each_dma_alias() does. > > Instead, we're trying to figure out which IOMMU group A belongs to. > But I still don't quite understand how aliases fit into this. Let's > go back to my example and assume we've already put 00:00.0 and 01:01.0 > in IOMMU groups: > > 00:00.0 PCIe-to-PCI bridge to [bus 01] # in IOMMU group G0 > 01:01.0 conventional PCI device # in IOMMU group G1 > > I assume these devices are in different IOMMU groups because if the > bridge generated its own DMA, it would use Requester ID 00:00.0, which > is distinct from the 01:00.0 it would use when forwarding DMAs from > its secondary side. > > What happens when we add 01:02.0? I think 01:01.0 and 01:02.0 should > both end up in IOMMU group G1 because the IOMMU will see only > Requester ID 01:00.0, so it can't distinguish them. > > When we add 01:02.0, the ops->add_device() ... ops->device_group() > path calls pci_device_group(01:02.0): > > pci_device_group(01:02.0) > pci_for_each_dma_alias(01:02.0, get_pci_alias_or_group) > get_pci_alias_or_group(01:02.0, 01:02.0) # callback > return 0 # 01:02.0 group not set yet > get_pci_alias_or_group(00:00.0, 01:00.0) # callback > return 1 # 00:00.0 is in G0 > > It seems like we'll assign 01:02.0 to group G0, when I think it should > be in G1. Where did I go wrong? Ping? -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2016-04-08 at 11:06 -0500, Bjorn Helgaas wrote: > > I think I'm confused because I don't really understand IOMMU > > groups. ,,, > Ping? I think this ended up being more of a Alex question...
On Fri, 8 Apr 2016 11:06:32 -0500 Bjorn Helgaas <helgaas@kernel.org> wrote: > On Tue, Mar 15, 2016 at 07:48:17PM -0500, Bjorn Helgaas wrote: > > On Mon, Mar 14, 2016 at 10:43:40PM +0000, David Woodhouse wrote: > > > On Thu, 2016-02-25 at 08:38 -0600, Bjorn Helgaas wrote: > > > > > > > > > /* > > > > > - * Look for aliases to or from the given device for exisiting groups. The > > > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the search > > > > > + * Look for aliases to or from the given device for existing groups. DMA > > > > > + * aliases are only supported on the same bus, therefore the search > > > > > > > > I'm trying to reconcile this statement that "DMA aliases are only > > > > supported on the same bus" (which was there even before this patch) > > > > with the fact that pci_for_each_dma_alias() does not have that > > > > limitation. > > > > > > Doesn't it? You can still only set a DMA alias on the same bus with > > > pci_add_dma_alias(), can't you? > > > > I guess it's true that PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and the proposed > > pci_add_dma_alias() only add aliases on the same bus. I was thinking > > about a scenario like this: > > > > 00:00.0 PCIe-to-PCI bridge to [bus 01] > > 01:01.0 conventional PCI device > > > > where I think 01:00.0 is a DMA alias for 01:01.0 because the bridge > > takes ownership of DMA transactions from 01:01.0 and assigns a > > Requester ID of 01:00.0 (secondary bus number, device 0, function 0). This is true, but this is a topology alias, not a quirk. pci_for_each_dma_alias() already handles this case. It would trigger the callback function for any direct alias of the conventional device as well as the bridge itself. Obviously we don't end up with any quirks for conventional devices because the aliases are masked behind the bridge anyway. > > > > > * space is quite small (especially since we're really only looking at pcie > > > > > * device, and therefore only expect multiple slots on the root complex or > > > > > * downstream switch ports). It's conceivable though that a pair of > > > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, > > > > > continue; > > > > > > > > > > /* We alias them or they alias us */ > > > > > - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > > > - pdev->dma_alias_devfn == tmp->devfn) || > > > > > - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && > > > > > - tmp->dma_alias_devfn == pdev->devfn)) { > > > > > - > > > > > + if (dma_alias_is_enabled(pdev, tmp->devfn) || > > > > > + dma_alias_is_enabled(tmp, pdev->devfn)) { > > > > > group = get_pci_alias_group(tmp, devfns); > > > > > > > > We basically have this: > > > > > > > > for_each_pci_dev(tmp) { > > > > if () > > > > group = get_pci_alias_group(); > > > > ... > > > > } > > > > > > Strictly, that's: > > > > > > for_each_pci_dev(tmp) { > > > if (pdev is an alias of tmp || tmp is an alias of pdev) > > > group = get_pci_alias_group(); > > > ... > > > } > > > > OK. > > > > > > I'm trying to figure out why we don't do something like the following > > > > instead: > > > > > > > > callback(struct pci_dev *pdev, u16 alias, void *opaque) > > > > { > > > > struct iommu_group *group; > > > > > > > > group = get_pci_alias_group(); > > > > if (group) > > > > return group; > > > > > > > > return 0; > > > > } > > > > > > > > pci_for_each_dma_alias(pdev, callback, ...); > > > > > > And this would be equivalent to > > > > > > for_each_pci_dev(tmp) { > > > if (tmp is an alias of pdev) > > > group = get_pci_alias_group(); > > > ... > > > } > > > > > > The "is an alias of" property is not commutative. Perhaps it should be. > > > But that's hard because in some cases the alias doesn't even *exist* as > > > a real PCI device. It's just that you appear to get DMA transactions > > > from a given source-id. > > > > Right. In my example above, 01:00.0 is not a PCI device; it's only a > > Requester ID that is fabricated by the bridge when it forwards DMA > > transactions upstream. > > > > I think I'm confused because I don't really understand IOMMU groups. > > > > Let me explain what I think they are and you can correct me when I go > > wrong. The iommu_group_alloc() comment says "The IOMMU group > > represents the minimum granularity of the IOMMU." So I suppose the > > IOMMU cannot distinguish between devices in a group. All the devices > > in the group use the same set of DMA mappings. Granting device A DMA > > access to a buffer grants the same access to all other members of A's > > IOMMU group. > > > > That would mean my question was fundamentally backwards. In > > get_pci_alias_group(A), we're not trying to figure out what all the > > aliases of A are, which is what pci_for_each_dma_alias() does. > > > > Instead, we're trying to figure out which IOMMU group A belongs to. > > But I still don't quite understand how aliases fit into this. Let's > > go back to my example and assume we've already put 00:00.0 and 01:01.0 > > in IOMMU groups: > > > > 00:00.0 PCIe-to-PCI bridge to [bus 01] # in IOMMU group G0 > > 01:01.0 conventional PCI device # in IOMMU group G1 > > > > I assume these devices are in different IOMMU groups because if the > > bridge generated its own DMA, it would use Requester ID 00:00.0, which > > is distinct from the 01:00.0 it would use when forwarding DMAs from > > its secondary side. The actual requester ID of the bridge depends on quirks as well, see PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS, so it might either be the bridge itself or the subordinate bus address. However, trace how these groups would be created, the bridge device is discovered first, so we call pci_device_group() for it. since it's on the root bus and it's the 0th function, there will be no existing groups for it to alias, so it gets a new group. Then we discover 01:01.0, pci_device_group() calls pci_for_each_dma_alias(), which hits the bridge, regardless of which alias is used. So 01:01.0 ends up in the same iommu group as the bridge. From the purist standpoint of iommu groups, your expectations are probably correct, but we also include within IOMMU groups DMA isolation between devices. So if two devices can DMA between each other without it necessarily passing through the iommu, the devices are grouped together. The more important cases of this are ACS isolation on PCI-e devices, but it still sort of applies to this conventional PCI example here. > > What happens when we add 01:02.0? I think 01:01.0 and 01:02.0 > > should both end up in IOMMU group G1 because the IOMMU will see only > > Requester ID 01:00.0, so it can't distinguish them. > > > > When we add 01:02.0, the ops->add_device() ... ops->device_group() > > path calls pci_device_group(01:02.0): > > > > pci_device_group(01:02.0) > > pci_for_each_dma_alias(01:02.0, get_pci_alias_or_group) > > get_pci_alias_or_group(01:02.0, 01:02.0) # callback > > return 0 # 01:02.0 group not set yet > > get_pci_alias_or_group(00:00.0, 01:00.0) # callback > > return 1 # 00:00.0 is in G0 > > > > It seems like we'll assign 01:02.0 to group G0, when I think it > > should be in G1. Where did I go wrong? In fact they're all part of G0, so you went wrong assuming the initial grouping rather than applying the same rules and tracing how 01:01.0 is added. It's not an ideal representation, it tends to be more conservative than necessary in some cases, but keeping the group attached to devices has advantages in being able to search and find points like this where we don't necessarily have access to the group behind the bridge without tying it to the bridge itself. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 0e3b009..a214e19 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -659,9 +659,15 @@ static struct iommu_group *get_pci_function_alias_group(struct pci_dev *pdev, return NULL; } +static bool dma_alias_is_enabled(struct pci_dev *dev, u8 devfn) +{ + return dev->dma_alias_mask && + test_bit(devfn, dev->dma_alias_mask); +} + /* - * Look for aliases to or from the given device for exisiting groups. The - * dma_alias_devfn only supports aliases on the same bus, therefore the search + * Look for aliases to or from the given device for existing groups. DMA + * aliases are only supported on the same bus, therefore the search * space is quite small (especially since we're really only looking at pcie * device, and therefore only expect multiple slots on the root complex or * downstream switch ports). It's conceivable though that a pair of @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev, continue; /* We alias them or they alias us */ - if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && - pdev->dma_alias_devfn == tmp->devfn) || - ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) && - tmp->dma_alias_devfn == pdev->devfn)) { - + if (dma_alias_is_enabled(pdev, tmp->devfn) || + dma_alias_is_enabled(tmp, pdev->devfn)) { group = get_pci_alias_group(tmp, devfns); if (group) { pci_dev_put(tmp); diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 8b0a637..eb4dd49 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4578,8 +4578,15 @@ int pci_set_vga_state(struct pci_dev *dev, bool decode, */ void pci_add_dma_alias(struct pci_dev *dev, u8 devfn) { - dev->dma_alias_devfn = PCI_DEVFN(PCI_SLOT(dev->devfn), 0); - dev->dev_flags |= PCI_DEV_FLAGS_DMA_ALIAS_DEVFN; + if (!dev->dma_alias_mask) + dev->dma_alias_mask = kcalloc(BITS_TO_LONGS(U8_MAX), + sizeof(long), GFP_KERNEL); + if (!dev->dma_alias_mask) { + dev_warn(&dev->dev, "Unable to allocate DMA alias mask\n"); + return; + } + + set_bit(devfn, dev->dma_alias_mask); dev_info(&dev->dev, "Enabling fixed DMA alias to %02x.%d\n", PCI_SLOT(devfn), PCI_FUNC(devfn)); } diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6d7ab9b..23fc397 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1503,6 +1503,7 @@ static void pci_release_dev(struct device *dev) pcibios_release_device(pci_dev); pci_bus_put(pci_dev->bus); kfree(pci_dev->driver_override); + kfree(pci_dev->dma_alias_mask); kfree(pci_dev); } diff --git a/drivers/pci/search.c b/drivers/pci/search.c index a20ce7d..33e0f03 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -40,11 +40,15 @@ int pci_for_each_dma_alias(struct pci_dev *pdev, * If the device is broken and uses an alias requester ID for * DMA, iterate over that too. */ - if (unlikely(pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN)) { - ret = fn(pdev, PCI_DEVID(pdev->bus->number, - pdev->dma_alias_devfn), data); - if (ret) - return ret; + if (unlikely(pdev->dma_alias_mask)) { + u8 devfn; + + for_each_set_bit(devfn, pdev->dma_alias_mask, U8_MAX) { + ret = fn(pdev, PCI_DEVID(pdev->bus->number, devfn), + data); + if (ret) + return ret; + } } for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) { diff --git a/include/linux/pci.h b/include/linux/pci.h index 27df4a6..d9e0c84 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -172,8 +172,6 @@ enum pci_dev_flags { PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) (1 << 2), /* Flag for quirk use to store if quirk-specific ACS is enabled */ PCI_DEV_FLAGS_ACS_ENABLED_QUIRK = (__force pci_dev_flags_t) (1 << 3), - /* Flag to indicate the device uses dma_alias_devfn */ - PCI_DEV_FLAGS_DMA_ALIAS_DEVFN = (__force pci_dev_flags_t) (1 << 4), /* Use a PCIe-to-PCI bridge alias even if !pci_is_pcie */ PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS = (__force pci_dev_flags_t) (1 << 5), /* Do not use bus resets for device */ @@ -279,7 +277,7 @@ struct pci_dev { u8 rom_base_reg; /* which config register controls the ROM */ u8 pin; /* which interrupt pin this device uses */ u16 pcie_flags_reg; /* cached PCIe Capabilities Register */ - u8 dma_alias_devfn;/* devfn of DMA alias, if any */ + unsigned long *dma_alias_mask;/* mask of enabled devfn aliases */ struct pci_driver *driver; /* which driver has allocated this device */ u64 dma_mask; /* Mask of the bits of bus address this
From: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> <Insert changelog here> --- drivers/iommu/iommu.c | 17 ++++++++++------- drivers/pci/pci.c | 11 +++++++++-- drivers/pci/probe.c | 1 + drivers/pci/search.c | 14 +++++++++----- include/linux/pci.h | 4 +--- 5 files changed, 30 insertions(+), 17 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html