diff mbox

[Part2,v5,21/31] PCI/MSI: Enhance core to support hierarchy irqdomain

Message ID 1415283644-2559-22-git-send-email-jiang.liu@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jiang Liu Nov. 6, 2014, 2:20 p.m. UTC
Enhance PCI MSI core to support hierarchy irqdomain, so the common
code could be shared among architectures.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Thomas,
	These changes are a temporary solution, I'm working on another
patch set which will refine these interfaces, especially kill 
arch_msi_irq_domain_{set|get}_hwirq().
Regards!
Gerry
---
 drivers/pci/Kconfig |    4 ++
 drivers/pci/msi.c   |  134 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/msi.h |   14 ++++++
 3 files changed, 152 insertions(+)

Comments

Thomas Gleixner Nov. 7, 2014, 10:40 a.m. UTC | #1
On Thu, 6 Nov 2014, Jiang Liu wrote:

> Enhance PCI MSI core to support hierarchy irqdomain, so the common
> code could be shared among architectures.
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
> Hi Thomas,
> 	These changes are a temporary solution, I'm working on another
> patch set which will refine these interfaces, especially kill 
> arch_msi_irq_domain_{set|get}_hwirq().

Ok. Please don't forget the irq_chip change :)

Thanks,

	tglx
Suravee Suthikulpanit Nov. 9, 2014, 4:26 a.m. UTC | #2
Hi Gerry,

Please see my comments / questions below.

On 11/6/14 21:20, Jiang Liu wrote:
> Enhance PCI MSI core to support hierarchy irqdomain, so the common
> code could be shared among architectures.
>
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
> Hi Thomas,
> 	These changes are a temporary solution, I'm working on another
> patch set which will refine these interfaces, especially kill
> arch_msi_irq_domain_{set|get}_hwirq().
> Regards!
> Gerry

Would the change includes the struct irqdomain_msi_data proposed by 
Thomas here (https://lkml.org/lkml/2014/11/6/210)?

> [...]
> +static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
> +			    unsigned int nr_irqs, void *arg)
> +{
> +	int i, ret;
> +	irq_hw_number_t hwirq = arch_msi_irq_domain_get_hwirq(arg);
> +
> +	if (irq_find_mapping(domain, hwirq) > 0)
> +		return -EEXIST;
> +
> +	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
> +	if (ret < 0)
> +		return ret;
> +

When executing irq_domain_alloc_irqs_parent(), it triggers the following 
warning message due to

	WARN_ON(desc->irq_data.chip == &no_irq_chip)

------------[ cut here ]------------
WARNING: CPU: 2 PID: 912 at kernel/irq/chip.c:734 
__irq_set_handler+0x138/0x13c()
Modules linked in: mlx4_core(+) rtc_efi efivarfs
CPU: 2 PID: 912 Comm: modprobe Not tainted 3.18.0-rc3-p2v5+ #53
Call trace:
[<fffffe00000963d4>] dump_backtrace+0x0/0x16c
[<fffffe0000096550>] show_stack+0x10/0x1c
[<fffffe000067545c>] dump_stack+0x74/0x98
[<fffffe00000b205c>] warn_slowpath_common+0x84/0xac
[<fffffe00000b2148>] warn_slowpath_null+0x14/0x20
[<fffffe00000ed968>] __irq_set_handler+0x134/0x13c
[<fffffe000042af14>] gic_irq_domain_map+0x4c/0xac
[<fffffe000042afd4>] gic_irq_domain_alloc+0x60/0x88
[<fffffe000042b374>] gicv2m_domain_alloc+0x30/0xa8
[<fffffe00000efc1c>] __irq_domain_alloc_irqs+0x144/0x30c
[<fffffe000042b58c>] gicv2m_setup_msi_irq+0xc0/0x118
[<fffffe000044548c>] arch_setup_msi_irq+0x34/0x60
[<fffffe0000445544>] arch_setup_msi_irqs+0x50/0xb0
[<fffffe0000445e3c>] pci_enable_msix+0x310/0x39c
[<fffffe0000445efc>] pci_enable_msix_range+0x34/0x9c
....

However, I think if we call irq_domain_set_hwirq_and_chip() in the 
msi_create_irq_domain() as suggested by Thomas, that should also fix 
this issue.

> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
> +					      domain->host_data,
> +					      (void *)(long)i);
> +		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
> +	}

Is there are a way to specify other type of handler besides edge?

> [...]
> +static void msi_domain_activate(struct irq_domain *domain,
> +				struct irq_data *irq_data)
> +{
> +	struct msi_msg msg;
> +
> +	/*
> +	 * irq_data->chip_data is MSI/MSI-X offset.
> +	 * MSI-X message is written per-IRQ, the offset is always 0.
> +	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
> +	 */
> +	if (irq_data->chip_data)
> +		return;

Actually, I am a bit confused with this comment here.  If you look at 
"/drivers/pci/msi.c: arch_setup_msi_irq()", it calls 
irq_set_chip_data(des->irq, chip) where chip is *msi_chip.

It looks like if the arch uses this API, it would conflict with what you 
have here where the irq_data->chip_data would be not NULL at the logic 
above, and would always return.

Thank you,

Suravee
Jiang Liu Nov. 9, 2014, 7:15 a.m. UTC | #3
On 2014/11/9 12:26, Suravee Suthikulpanit wrote:
> Hi Gerry,
> 
> Please see my comments / questions below.
> 
> On 11/6/14 21:20, Jiang Liu wrote:
>> Enhance PCI MSI core to support hierarchy irqdomain, so the common
>> code could be shared among architectures.
>>
>> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
>> ---
>> Hi Thomas,
>>     These changes are a temporary solution, I'm working on another
>> patch set which will refine these interfaces, especially kill
>> arch_msi_irq_domain_{set|get}_hwirq().
>> Regards!
>> Gerry
> 
> Would the change includes the struct irqdomain_msi_data proposed by
> Thomas here (https://lkml.org/lkml/2014/11/6/210)?
Hi Suravee,
	I'm working on another patch set which will refine the way to
create irqdomain for MSI. I hope will solve all issues mentioned by
Thomas. I will post that patch set as RFC within one or two days.

> 
>> [...]
>> +static int msi_domain_alloc(struct irq_domain *domain, unsigned int
>> virq,
>> +                unsigned int nr_irqs, void *arg)
>> +{
>> +    int i, ret;
>> +    irq_hw_number_t hwirq = arch_msi_irq_domain_get_hwirq(arg);
>> +
>> +    if (irq_find_mapping(domain, hwirq) > 0)
>> +        return -EEXIST;
>> +
>> +    ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
>> +    if (ret < 0)
>> +        return ret;
>> +
> 
> When executing irq_domain_alloc_irqs_parent(), it triggers the following
> warning message due to
> 
>     WARN_ON(desc->irq_data.chip == &no_irq_chip)
> 
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 912 at kernel/irq/chip.c:734
> __irq_set_handler+0x138/0x13c()
> Modules linked in: mlx4_core(+) rtc_efi efivarfs
> CPU: 2 PID: 912 Comm: modprobe Not tainted 3.18.0-rc3-p2v5+ #53
> Call trace:
> [<fffffe00000963d4>] dump_backtrace+0x0/0x16c
> [<fffffe0000096550>] show_stack+0x10/0x1c
> [<fffffe000067545c>] dump_stack+0x74/0x98
> [<fffffe00000b205c>] warn_slowpath_common+0x84/0xac
> [<fffffe00000b2148>] warn_slowpath_null+0x14/0x20
> [<fffffe00000ed968>] __irq_set_handler+0x134/0x13c
> [<fffffe000042af14>] gic_irq_domain_map+0x4c/0xac
> [<fffffe000042afd4>] gic_irq_domain_alloc+0x60/0x88
> [<fffffe000042b374>] gicv2m_domain_alloc+0x30/0xa8
> [<fffffe00000efc1c>] __irq_domain_alloc_irqs+0x144/0x30c
> [<fffffe000042b58c>] gicv2m_setup_msi_irq+0xc0/0x118
> [<fffffe000044548c>] arch_setup_msi_irq+0x34/0x60
> [<fffffe0000445544>] arch_setup_msi_irqs+0x50/0xb0
> [<fffffe0000445e3c>] pci_enable_msix+0x310/0x39c
> [<fffffe0000445efc>] pci_enable_msix_range+0x34/0x9c
> ....
> 
> However, I think if we call irq_domain_set_hwirq_and_chip() in the
> msi_create_irq_domain() as suggested by Thomas, that should also fix
> this issue.
We didn't trigger this warning on x86 because only the outer-most
irqdomain will set irq flow handler. On ARM, the inner GIC domain
will set irq flow handler, but the irq_data->chip for outer-most
domain hasn't been set yet.

I'm still find a way out here. Maybe we need to relax the WARN_ON().

> 
>> +    for (i = 0; i < nr_irqs; i++) {
>> +        irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
>> +                          domain->host_data,
>> +                          (void *)(long)i);
>> +        __irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
>> +    }
> 
> Is there are a way to specify other type of handler besides edge?
Yes, the new patch set will address this requirement.

> 
>> [...]
>> +static void msi_domain_activate(struct irq_domain *domain,
>> +                struct irq_data *irq_data)
>> +{
>> +    struct msi_msg msg;
>> +
>> +    /*
>> +     * irq_data->chip_data is MSI/MSI-X offset.
>> +     * MSI-X message is written per-IRQ, the offset is always 0.
>> +     * MSI message denotes a contiguous group of IRQs, written for
>> 0th IRQ.
>> +     */
>> +    if (irq_data->chip_data)
>> +        return;
> 
> Actually, I am a bit confused with this comment here.  If you look at
> "/drivers/pci/msi.c: arch_setup_msi_irq()", it calls
> irq_set_chip_data(des->irq, chip) where chip is *msi_chip.
> 
> It looks like if the arch uses this API, it would conflict with what you
> have here where the irq_data->chip_data would be not NULL at the logic
> above, and would always return.
Good point, it's work on x86, but will break ARM or other platforms.
Will refine this part.
Regards!
Gerry
> 
> Thank you,
> 
> Suravee
Suravee Suthikulpanit Nov. 10, 2014, 2:03 a.m. UTC | #4
On 11/9/14 14:15, Jiang Liu wrote:
> On 2014/11/9 12:26, Suravee Suthikulpanit wrote:
>> Hi Gerry,
>>
>> Please see my comments / questions below.
>>
>> On 11/6/14 21:20, Jiang Liu wrote:
>>> Enhance PCI MSI core to support hierarchy irqdomain, so the common
>>> code could be shared among architectures.
>>>
>>> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
>>> ---
>>> Hi Thomas,
>>>      These changes are a temporary solution, I'm working on another
>>> patch set which will refine these interfaces, especially kill
>>> arch_msi_irq_domain_{set|get}_hwirq().
>>> Regards!
>>> Gerry
>>
>> Would the change includes the struct irqdomain_msi_data proposed by
>> Thomas here (https://lkml.org/lkml/2014/11/6/210)?
> Hi Suravee,
> 	I'm working on another patch set which will refine the way to
> create irqdomain for MSI. I hope will solve all issues mentioned by
> Thomas. I will post that patch set as RFC within one or two days.

Let me know if I can help.

>>
>>> [...]
>>> +static int msi_domain_alloc(struct irq_domain *domain, unsigned int
>>> virq,
>>> +                unsigned int nr_irqs, void *arg)
>>> +{
>>> +    int i, ret;
>>> +    irq_hw_number_t hwirq = arch_msi_irq_domain_get_hwirq(arg);
>>> +
>>> +    if (irq_find_mapping(domain, hwirq) > 0)
>>> +        return -EEXIST;
>>> +
>>> +    ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>
>> When executing irq_domain_alloc_irqs_parent(), it triggers the following
>> warning message due to
>>
>>      WARN_ON(desc->irq_data.chip == &no_irq_chip)
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 912 at kernel/irq/chip.c:734
>> __irq_set_handler+0x138/0x13c()
>> Modules linked in: mlx4_core(+) rtc_efi efivarfs
>> CPU: 2 PID: 912 Comm: modprobe Not tainted 3.18.0-rc3-p2v5+ #53
>> Call trace:
>> [<fffffe00000963d4>] dump_backtrace+0x0/0x16c
>> [<fffffe0000096550>] show_stack+0x10/0x1c
>> [<fffffe000067545c>] dump_stack+0x74/0x98
>> [<fffffe00000b205c>] warn_slowpath_common+0x84/0xac
>> [<fffffe00000b2148>] warn_slowpath_null+0x14/0x20
>> [<fffffe00000ed968>] __irq_set_handler+0x134/0x13c
>> [<fffffe000042af14>] gic_irq_domain_map+0x4c/0xac
>> [<fffffe000042afd4>] gic_irq_domain_alloc+0x60/0x88
>> [<fffffe000042b374>] gicv2m_domain_alloc+0x30/0xa8
>> [<fffffe00000efc1c>] __irq_domain_alloc_irqs+0x144/0x30c
>> [<fffffe000042b58c>] gicv2m_setup_msi_irq+0xc0/0x118
>> [<fffffe000044548c>] arch_setup_msi_irq+0x34/0x60
>> [<fffffe0000445544>] arch_setup_msi_irqs+0x50/0xb0
>> [<fffffe0000445e3c>] pci_enable_msix+0x310/0x39c
>> [<fffffe0000445efc>] pci_enable_msix_range+0x34/0x9c
>> ....
>>
>> However, I think if we call irq_domain_set_hwirq_and_chip() in the
>> msi_create_irq_domain() as suggested by Thomas, that should also fix
>> this issue.
> We didn't trigger this warning on x86 because only the outer-most
> irqdomain will set irq flow handler. On ARM, the inner GIC domain
> will set irq flow handler, but the irq_data->chip for outer-most
> domain hasn't been set yet.
>
> I'm still find a way out here. Maybe we need to relax the WARN_ON().

What if we just switch the order and call the 
irq_domain_alloc_irqs_parent() after the loop? That way we can still 
keep the WARN_ON().

Suravee
diff mbox

Patch

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index b9db0f2ce11f..022e89745f86 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -16,6 +16,10 @@  config PCI_MSI
 
 	   If you don't know what to do here, say Y.
 
+config PCI_MSI_IRQ_DOMAIN
+	bool
+	depends on PCI_MSI && IRQ_DOMAIN_HIERARCHY
+
 config PCI_DEBUG
 	bool "PCI Debugging"
 	depends on PCI && DEBUG_KERNEL
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index da181c59394b..8de7c8774fd2 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -19,6 +19,7 @@ 
 #include <linux/errno.h>
 #include <linux/io.h>
 #include <linux/slab.h>
+#include <linux/irqdomain.h>
 
 #include "pci.h"
 
@@ -1098,3 +1099,136 @@  int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
 	return nvec;
 }
 EXPORT_SYMBOL(pci_enable_msix_range);
+
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+/*
+ * Generate a unique ID number for each possible MSI source, the ID number
+ * is only used within the irqdomain.
+ */
+static inline irq_hw_number_t
+msi_get_hwirq(struct pci_dev *dev, struct msi_desc *desc)
+{
+	return (irq_hw_number_t)desc->msi_attrib.entry_nr |
+		PCI_DEVID(dev->bus->number, dev->devfn) << 11 |
+		(pci_domain_nr(dev->bus) & 0xFFFFFFFF) << 27;
+}
+
+static int msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs, void *arg)
+{
+	int i, ret;
+	irq_hw_number_t hwirq = arch_msi_irq_domain_get_hwirq(arg);
+
+	if (irq_find_mapping(domain, hwirq) > 0)
+		return -EEXIST;
+
+	ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i,
+					      domain->host_data,
+					      (void *)(long)i);
+		__irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
+	}
+
+	return ret;
+}
+
+static void msi_domain_free(struct irq_domain *domain, unsigned int virq,
+			    unsigned int nr_irqs)
+{
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		struct msi_desc *desc = irq_get_msi_desc(virq);
+
+		if (desc)
+			desc->irq = 0;
+	}
+	irq_domain_free_irqs_top(domain, virq, nr_irqs);
+}
+
+static void msi_domain_activate(struct irq_domain *domain,
+				struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	/*
+	 * irq_data->chip_data is MSI/MSI-X offset.
+	 * MSI-X message is written per-IRQ, the offset is always 0.
+	 * MSI message denotes a contiguous group of IRQs, written for 0th IRQ.
+	 */
+	if (irq_data->chip_data)
+		return;
+
+	BUG_ON(irq_chip_compose_msi_msg(irq_data, &msg));
+	__write_msi_msg(irq_data->msi_desc, &msg);
+}
+
+static void msi_domain_deactivate(struct irq_domain *domain,
+				  struct irq_data *irq_data)
+{
+	struct msi_msg msg;
+
+	if (!irq_data->chip_data) {
+		memset(&msg, 0, sizeof(msg));
+		__write_msi_msg(irq_data->msi_desc, &msg);
+	}
+}
+
+static struct irq_domain_ops msi_domain_ops = {
+	.alloc = msi_domain_alloc,
+	.free = msi_domain_free,
+	.activate = msi_domain_activate,
+	.deactivate = msi_domain_deactivate,
+};
+
+struct irq_domain *msi_create_irq_domain(struct device_node *of_node,
+					 struct irq_chip *chip,
+					 struct irq_domain *parent)
+{
+	struct irq_domain *domain;
+
+	domain = irq_domain_add_tree(of_node, &msi_domain_ops, chip);
+	if (!domain)
+		return NULL;
+
+	domain->parent = parent;
+
+	return domain;
+}
+
+int msi_irq_domain_alloc_irqs(struct irq_domain *domain, int type,
+			      struct pci_dev *dev, void *arg)
+{
+	int i, virq;
+	struct msi_desc *desc;
+	int node = dev_to_node(&dev->dev);
+
+	list_for_each_entry(desc, &dev->msi_list, list) {
+		arch_msi_irq_domain_set_hwirq(arg, msi_get_hwirq(dev, desc));
+		virq = irq_domain_alloc_irqs(domain, desc->nvec_used,
+					     node, arg);
+		if (virq < 0) {
+			/* Special handling for pci_enable_msi_range(). */
+			if (type == PCI_CAP_ID_MSI && desc->nvec_used > 1)
+				return 1;
+			else
+				return -ENOSPC;
+		}
+		for (i = 0; i < desc->nvec_used; i++)
+			irq_set_msi_desc_off(virq, i, desc);
+	}
+
+	list_for_each_entry(desc, &dev->msi_list, list)
+		if (desc->nvec_used == 1)
+			dev_dbg(&dev->dev, "irq %d for MSI/MSI-X\n", virq);
+		else
+			dev_dbg(&dev->dev, "irq [%d-%d] for MSI/MSI-X\n",
+				virq, virq + desc->nvec_used - 1);
+
+	return 0;
+}
+#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 44f4746d033b..662c628fc2fa 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -75,4 +75,18 @@  struct msi_chip {
 	void (*teardown_irq)(struct msi_chip *chip, unsigned int irq);
 };
 
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+struct irq_domain;
+struct irq_chip;
+
+struct irq_domain *msi_create_irq_domain(struct device_node *of_node,
+					 struct irq_chip *chip,
+					 struct irq_domain *parent);
+int msi_irq_domain_alloc_irqs(struct irq_domain *domain, int type,
+			      struct pci_dev *dev, void *arg);
+
+irq_hw_number_t arch_msi_irq_domain_get_hwirq(void *arg);
+void arch_msi_irq_domain_set_hwirq(void *arg, irq_hw_number_t hwirq);
+#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
+
 #endif /* LINUX_MSI_H */