[1/4] arm64: gicv3: its: Encode domain number in PCI stream id

Message ID 1430686172-18222-2-git-send-email-rric@kernel.org (mailing list archive)
State New, archived

Commit Message

Robert Richter May 3, 2015, 8:49 p.m. UTC
From: Tirumalesh Chalamarla <tchalamarla@cavium.com>

PCI stream ids need to consider pci bridge number to be unique on the
system. Using only bus and devfn can't do the trick in systems that
have multiple pci bridges.

Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
---
 drivers/irqchip/irq-gic-v3-its.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Marc Zyngier May 20, 2015, 12:11 p.m. UTC | #1
On Sun, 3 May 2015 21:49:29 +0100
Robert Richter <rric@kernel.org> wrote:

> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> 
> PCI stream ids need to consider pci bridge number to be unique on the
> system. Using only bus and devfn can't do the trick in systems that
> have multiple pci bridges.
> 
> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9687f8afebff..e30b4de04c6c 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1186,7 +1186,7 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
>  {
>  	struct its_pci_alias *dev_alias = data;
>  
> -	dev_alias->dev_id = alias;
> +	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;
>  	if (pdev != dev_alias->pdev)
>  		dev_alias->count += its_pci_msi_vec_count(dev_alias->pdev);
>  

This feels very scary. We're now assuming that the domain number will
always be presented to the doorbell. What guarantee do we have that
this is always the case, irrespective of the platform?

Also, domains have no PCI reality, they are a Linux thing. And they can
be "randomly" assigned, unless you force the domain in DT with a
linux,pci-domain property. This looks even more wrong, especially
considering ACPI.

It really feels like we need a way to describe how the BDF numbering is
augmented. We also need to guarantee that we get the actual bridge
number, as opposed to the domain number.

Thoughts?

	M.
Robert Richter May 20, 2015, 12:48 p.m. UTC | #2
Mark,

thanks for review, also of the other patches of this series.

See below

On 20.05.15 13:11:38, Marc Zyngier wrote:
> > -	dev_alias->dev_id = alias;
> > +	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;

> This feels very scary. We're now assuming that the domain number will
> always be presented to the doorbell. What guarantee do we have that
> this is always the case, irrespective of the platform?
> 
> Also, domains have no PCI reality, they are a Linux thing. And they can
> be "randomly" assigned, unless you force the domain in DT with a
> linux,pci-domain property. This looks even more wrong, especially
> considering ACPI.

The main problem here is that device ids (32 bits) are system
specific. Since we have more than one PCI root complex we need the
upper 16 bits in the devid for mapping. Using pci_domain_nr for this
fits our needs for now and shouldn't affect systems with a single RC
only as the domain nr is zero then.

The domain number is incremented during initialization beginning with
zero and the order of it is fixed since it is taken from DT or ACPI
tables. So we have full control of it. I don't see issues here.

> It really feels like we need a way to describe how the BDF numbering is
> augmented. We also need to guarantee that we get the actual bridge
> number, as opposed to the domain number.

But true, the above is just intermediate. In the end we need some sort
of handler that is set up during CPU initialization that registers a
callback for the GIC to determine the device id of that particular
system.

-Robert
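
The collision described above can be sketched in a few lines of standalone C. This is only an illustration, not kernel code: pci_rid() and its_dev_id() are hypothetical helper names, and the layout assumed for the 16-bit requester ID is the usual bus/device/function packing. The only thing taken from the patch is the "(domain << 16) | alias" encoding.

```c
#include <stdint.h>

/*
 * Standard PCI requester ID layout (assumed here):
 * bus in bits 15:8, device in bits 7:3, function in bits 2:0.
 */
static uint16_t pci_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x07));
}

/*
 * Encoding used by the patch: the Linux PCI domain number goes into
 * the upper 16 bits, the 16-bit alias into the lower 16 bits.
 */
static uint32_t its_dev_id(uint16_t domain, uint16_t rid)
{
	return ((uint32_t)domain << 16) | rid;
}
```

With this sketch, device 01:00.0 has requester ID 0x0100 behind every root complex, so the alias alone cannot tell two root complexes apart. Adding the domain gives 0x00000100 for domain 0 and 0x00010100 for domain 1. Whether the hardware actually presents those upper bits to the ITS is the open question raised in the review.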
Marc Zyngier May 22, 2015, 8:26 a.m. UTC | #3
On 20/05/15 13:48, Robert Richter wrote:
> Mark,
> 
> thanks for review, also of the other patches of this series.
> 
> See below
> 
> On 20.05.15 13:11:38, Marc Zyngier wrote:
>>> -	dev_alias->dev_id = alias;
>>> +	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;
> 
>> This feels very scary. We're now assuming that the domain number will
>> always be presented to the doorbell. What guarantee do we have that
>> this is always the case, irrespective of the platform?
>>
>> Also, domains have no PCI reality, they are a Linux thing. And they can
>> be "randomly" assigned, unless you force the domain in DT with a
>> linux,pci-domain property. This looks even more wrong, especially
>> considering ACPI.
> 
> The main problem here is that device ids (32 bits) are system
> specific. Since we have more than one PCI root complex we need the
> upper 16 bits in the devid for mapping. Using pci_domain_nr for this
> fits our needs for now and shouldn't affect systems with a single RC
> only as the domain nr is zero then.
> 
> The domain number is incremented during initialization beginning with
> zero and the order of it is fixed since it is taken from DT or ACPI
> tables. So we have full control of it. I don't see issues here.

This may match what you have on ThunderX (as long as the kernel doesn't
adopt another behaviour when allocating the domain number). But other
platforms may have a completely different numbering, which will mess
them up entirely.

>> It really feels like we need a way to describe how the BDF numbering is
>> augmented. We also need to guarantee that we get the actual bridge
>> number, as opposed to the domain number.
> 
> But true, the above is just intermediate. In the end we need some sort
> of handler that is set up during CPU initialization that registers a
> callback for the GIC to determine the device id of that particular
> system.

I don't really like the idea of a callback from the GIC - I'd prefer it
to be standalone, and rely on the topology information to build the
DeviceID. Mark Rutland had some ideas for DT (he posted an RFC a while
ago), maybe it would be good to get back to that and find out what we
can do. ACPI should also have similar information (IORT?).

Thanks,

	M.
Chalamarla, Tirumalesh May 22, 2015, 10:57 p.m. UTC | #4
> On May 22, 2015, at 1:26 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> 
> On 20/05/15 13:48, Robert Richter wrote:
>> Mark,
>> 
>> thanks for review, also of the other patches of this series.
>> 
>> See below
>> 
>> On 20.05.15 13:11:38, Marc Zyngier wrote:
>>>> -	dev_alias->dev_id = alias;
>>>> +	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;
>> 
>>> This feels very scary. We're now assuming that the domain number will
>>> always be presented to the doorbell. What guarantee do we have that
>>> this is always the case, irrespective of the platform?
>>> 
>>> Also, domains have no PCI reality, they are a Linux thing. And they can
>>> be "randomly" assigned, unless you force the domain in DT with a
>>> linux,pci-domain property. This looks even more wrong, especially
>>> considering ACPI.
>> 
>> The main problem here is that device ids (32 bits) are system
>> specific. Since we have more than one PCI root complex we need the
>> upper 16 bits in the devid for mapping. Using pci_domain_nr for this
>> fits our needs for now and shouldn't affect systems with a single RC
>> only as the domain nr is zero then.
>> 
>> The domain number is incremented during initialization beginning with
>> zero and the order of it is fixed since it is taken from DT or ACPI
>> tables. So we have full control of it. I don't see issues here.
> 
> This may match what you have on ThunderX (as long as the kernel doesn't
> adopt another behaviour when allocating the domain number). But other
> platforms may have a completely different numbering, which will mess
> them up entirely.
> 
>>> It really feels like we need a way to describe how the BDF numbering is
>>> augmented. We also need to guarantee that we get the actual bridge
>>> number, as opposed to the domain number.
>> 
>> But true, the above is just intermediate. In the end we need some sort
>> of handler that is set up during CPU initialization that registers a
>> callback for the GIC to determine the device id of that particular
>> system.
> 
> I don't really like the idea of a callback from the GIC - I'd prefer it
> to be standalone, and rely on the topology information to build the
> DeviceID. Mark Rutland had some ideas for DT (he posted an RFC a while
> ago), maybe it would be good to get back to that and find out what we
> can do. ACPI should also have similar information (IORT?).
> 

How can someone pass this from DT, especially in the GIC entry? I still think it is the bus owner's responsibility, and a callback is the better idea.
But if someone has a better idea for DT and ACPI, we are fine with it as long as it works on ThunderX.

Thanks,
Tirumalesh. 


> Thanks,
> 
> 	M.
> -- 
> Jazz is not dead. It just smells funny...
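
For readers following the callback idea argued for in this message, a minimal sketch of what such a hook could look like is given here. Every name in it (its_devid_fn, its_register_devid_hook, its_compute_devid, multi_rc_devid) is hypothetical and was not proposed in the thread. It is plain C rather than kernel code, and it assumes the bus owner can compute the DeviceID from the domain number and the requester ID alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: maps (domain, requester ID) to an ITS DeviceID. */
typedef uint32_t (*its_devid_fn)(uint16_t domain, uint16_t rid);

/* Stays NULL until a platform registers a hook. */
static its_devid_fn its_devid_hook;

/* Hypothetical registration entry point; not an existing kernel API. */
static void its_register_devid_hook(its_devid_fn fn)
{
	its_devid_hook = fn;
}

/* Without a hook, the DeviceID is the plain alias, as before the patch. */
static uint32_t its_compute_devid(uint16_t domain, uint16_t rid)
{
	if (its_devid_hook)
		return its_devid_hook(domain, rid);
	return rid;
}

/* What a multi-root-complex platform might register: the patch's encoding. */
static uint32_t multi_rc_devid(uint16_t domain, uint16_t rid)
{
	return ((uint32_t)domain << 16) | rid;
}
```

A platform with several root complexes would call its_register_devid_hook(multi_rc_devid) early in boot, and single-RC systems would keep the default. The sketch also shows the weakness of the approach: the mapping lives in platform code instead of in the firmware description.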
Marc Zyngier May 25, 2015, 10:38 a.m. UTC | #5
On Fri, 22 May 2015 23:57:40 +0100
"Chalamarla, Tirumalesh" <Tirumalesh.Chalamarla@caviumnetworks.com>
wrote:

Hi Tirumalesh,

> 
> > On May 22, 2015, at 1:26 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > 
> > On 20/05/15 13:48, Robert Richter wrote:
> >> Mark,
> >> 
> >> thanks for review, also of the other patches of this series.
> >> 
> >> See below
> >> 
> >> On 20.05.15 13:11:38, Marc Zyngier wrote:
> >>>> -	dev_alias->dev_id = alias;
> >>>> +	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;
> >> 
> >>> This feels very scary. We're now assuming that the domain number will
> >>> always be presented to the doorbell. What guarantee do we have that
> >>> this is always the case, irrespective of the platform?
> >>> 
> >>> Also, domains have no PCI reality, they are a Linux thing. And they can
> >>> be "randomly" assigned, unless you force the domain in DT with a
> >>> linux,pci-domain property. This looks even more wrong, especially
> >>> considering ACPI.
> >> 
> >> The main problem here is that device ids (32 bits) are system
> >> specific. Since we have more than one PCI root complex we need the
> >> upper 16 bits in the devid for mapping. Using pci_domain_nr for this
> >> fits our needs for now and shouldn't affect systems with a single RC
> >> only as the domain nr is zero then.
> >> 
> >> The domain number is incremented during initialization beginning with
> >> zero and the order of it is fixed since it is taken from DT or ACPI
> >> tables. So we have full control of it. I don't see issues here.
> > 
> > This may match what you have on ThunderX (as long as the kernel doesn't
> > adopt another behaviour when allocating the domain number). But other
> > platforms may have a completely different numbering, which will mess
> > them up entirely.
> > 
> >>> It really feels like we need a way to describe how the BDF numbering is
> >>> augmented. We also need to guarantee that we get the actual bridge
> >>> number, as opposed to the domain number.
> >> 
> >> But true, the above is just intermediate. In the end we need some sort
> >> of handler that is set up during CPU initialization that registers a
> >> callback for the GIC to determine the device id of that particular
> >> system.
> > 
> > I don't really like the idea of a callback from the GIC - I'd prefer it
> > to be standalone, and rely on the topology information to build the
> > DeviceID. Mark Rutland had some ideas for DT (he posted an RFC a while
> > ago), maybe it would be good to get back to that and find out what we
> > can do. ACPI should also have similar information (IORT?).
> > 
> 
> How can someone pass this from DT, especially in the GIC entry? I still
> think it is the bus owner's responsibility, and a callback is the better
> idea. But if someone has a better idea for DT and ACPI, we are fine with
> it as long as it works on ThunderX.

A callback would have to be bus-specific, and would depend on the observer
of the access. There is strictly no guarantee that a single write from
the device is performed using the same ID to the IOMMU and to the MSI
doorbell. Actually, they are very likely to be different. A generic
callback would have to know about the point where this access is
observed, and expressing this is a nightmare.

Also, I'm really opposed to having platform-specific code whose sole
purpose is to describe the hardware. This is why we have DT (and to a
lesser extent ACPI). We've been there on 32bit, and learned our lesson.
It doesn't scale, it leads to a bunch of hacks in all corners, and I
don't feel like being on the receiving end of something like this.

I really suggest you look at Mark's suggestion:


http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/333199.html
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-May/341584.html

because so far, this is the only proposal that makes any sense to me in
the long run. Feel free to comment on it and help us make something
that also works for your favorite SoC.

Thanks,

        M.
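
One way to picture the firmware-described alternative argued for above is a per-root-complex translation table that the ITS code consults instead of calling into platform code. The sketch below is an assumption made for illustration and is not taken from the linked proposal: struct rid_map_entry, its fields, and rid_to_devid() are hypothetical names, and the table contents would come from DT or ACPI rather than being hard-coded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical map entry for one root complex: requester IDs in
 * [rid_base, rid_base + length) map to DeviceIDs starting at devid_base.
 */
struct rid_map_entry {
	uint32_t rid_base;
	uint32_t length;
	uint32_t devid_base;
};

/* Translate rid through the table; returns false if no entry covers it. */
static bool rid_to_devid(const struct rid_map_entry *map, size_t n,
			 uint32_t rid, uint32_t *devid)
{
	for (size_t i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid - map[i].rid_base < map[i].length) {
			*devid = map[i].devid_base + (rid - map[i].rid_base);
			return true;
		}
	}
	return false;
}
```

A root complex whose 16-bit requester IDs appear at the ITS offset by 0x10000 would be described by the single entry { 0x0000, 0x10000, 0x10000 }, which reproduces the patch's encoding for domain 1 without relying on the Linux domain number. A platform with a different wiring only changes the table, not the code.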
Patch

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9687f8afebff..e30b4de04c6c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1186,7 +1186,7 @@  static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
 {
 	struct its_pci_alias *dev_alias = data;
 
-	dev_alias->dev_id = alias;
+	dev_alias->dev_id = (pci_domain_nr(pdev->bus) << 16) | alias;
 	if (pdev != dev_alias->pdev)
 		dev_alias->count += its_pci_msi_vec_count(dev_alias->pdev);