diff mbox series

[v12,14/25] irqchip/sifive-plic: Convert PLIC driver into a platform driver

Message ID 20240127161753.114685-15-apatel@ventanamicro.com (mailing list archive)
State New, archived
Headers show
Series Linux RISC-V AIA Support | expand

Commit Message

Anup Patel Jan. 27, 2024, 4:17 p.m. UTC
The PLIC driver does not require very early initialization so let
us convert it into a platform driver.

As part of the conversion, the PLIC probing undergoes the following
changes:
1. Use dev_info(), dev_err() and dev_warn() instead of pr_info(),
   pr_err() and pr_warn()
2. Use devm_xyz() APIs wherever applicable
3. PLIC is now probed after CPUs are brought-up so we have to
   setup cpuhp state after context handler of all online CPUs
   are initialized otherwise we see crash on multi-socket systems

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
---
 drivers/irqchip/irq-sifive-plic.c | 239 ++++++++++++++++++------------
 1 file changed, 148 insertions(+), 91 deletions(-)

Comments

Thomas Gleixner Feb. 16, 2024, 3:33 p.m. UTC | #1
On Sat, Jan 27 2024 at 21:47, Anup Patel wrote:
> +	priv->irqdomain = irq_domain_create_linear(dev->fwnode, nr_irqs + 1,
> +						   &plic_irqdomain_ops, priv);
> +	if (WARN_ON(!priv->irqdomain))
> +		return -ENOMEM;

While some of the stuff is cleaned up by devm, the error handling in
this code looks pretty fragile as it leaves initialized contexts,
hardware state, chained handlers etc. around.

The question is whether the system can actually boot or work at all if
any of this fails.

> +
>  	/*
>  	 * We can have multiple PLIC instances so setup cpuhp state
> -	 * and register syscore operations only when context handler
> -	 * for current/boot CPU is present.
> +	 * and register syscore operations only after context handlers
> +	 * of all online CPUs are initialized.
>  	 */
> -	handler = this_cpu_ptr(&plic_handlers);
> -	if (handler->present && !plic_cpuhp_setup_done) {
> +	cpuhp_setup = true;
> +	for_each_online_cpu(cpu) {
> +		handler = per_cpu_ptr(&plic_handlers, cpu);
> +		if (!handler->present) {
> +			cpuhp_setup = false;
> +			break;
> +		}
> +	}
> +	if (cpuhp_setup) {
>  		cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
>  				  "irqchip/sifive/plic:starting",
>  				  plic_starting_cpu, plic_dying_cpu);
>  		register_syscore_ops(&plic_irq_syscore_ops);
> -		plic_cpuhp_setup_done = true;

I don't think that removing the setup protection is correct.

Assume you have maxcpus=N on the kernel command line, then the above
for_each_online_cpu() loop would result in cpuhp_setup == true when the
instances for the not onlined CPUs are set up, no?

Thanks,

        tglx
Anup Patel Feb. 16, 2024, 5:11 p.m. UTC | #2
On Fri, Feb 16, 2024 at 9:03 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, Jan 27 2024 at 21:47, Anup Patel wrote:
> > +     priv->irqdomain = irq_domain_create_linear(dev->fwnode, nr_irqs + 1,
> > +                                                &plic_irqdomain_ops, priv);
> > +     if (WARN_ON(!priv->irqdomain))
> > +             return -ENOMEM;
>
> While some of the stuff is cleaned up by devm, the error handling in
> this code looks pretty fragile as it leaves initialized contexts,
> hardware state, chained handlers etc. around.

Sure, let me try to improve the error handling.

>
> The question is whether the system can actually boot or work at all if
> any of this fails.

On platforms with PLIC, the PLIC only manages wired interrupts
whereas IPIs are provided through SBI (firmware interface) so a
system can actually continue and boot further without PLIC.

In fact, we do have a synthetic platform (namely QEMU spike)
where there is no PLIC instance and Linux boots using SBI based
polling console.

>
> > +
> >       /*
> >        * We can have multiple PLIC instances so setup cpuhp state
> > -      * and register syscore operations only when context handler
> > -      * for current/boot CPU is present.
> > +      * and register syscore operations only after context handlers
> > +      * of all online CPUs are initialized.
> >        */
> > -     handler = this_cpu_ptr(&plic_handlers);
> > -     if (handler->present && !plic_cpuhp_setup_done) {
> > +     cpuhp_setup = true;
> > +     for_each_online_cpu(cpu) {
> > +             handler = per_cpu_ptr(&plic_handlers, cpu);
> > +             if (!handler->present) {
> > +                     cpuhp_setup = false;
> > +                     break;
> > +             }
> > +     }
> > +     if (cpuhp_setup) {
> >               cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
> >                                 "irqchip/sifive/plic:starting",
> >                                 plic_starting_cpu, plic_dying_cpu);
> >               register_syscore_ops(&plic_irq_syscore_ops);
> > -             plic_cpuhp_setup_done = true;
>
> I don't think that removing the setup protection is correct.
>
> Assume you have maxcpus=N on the kernel command line, then the above
> for_each_online_cpu() loop would result in cpuhp_setup == true when the
> instances for the not onlined CPUs are set up, no?

A platform can have multiple PLIC instances where each PLIC
instance targets a subset of HARTs (or CPUs).

Previously (before this patch), we were probing PLIC very early so on
a platform with multiple PLIC instances, we need to ensure that cpuhp
setup is done only after PLIC context associated with boot CPU is
initialized hence the plic_cpuhp_setup_done check.

This patch converts PLIC driver into a platform driver so now PLIC
instances are probed after all available CPUs are brought-up. In this
case, the cpuhp setup must be done only after PLIC context of all
available CPUs are initialized otherwise some of the CPUs crash
in plic_starting_cpu() due to lack of PLIC context initialization.

Regards,
Anup
Thomas Gleixner Feb. 16, 2024, 8:22 p.m. UTC | #3
On Fri, Feb 16 2024 at 22:41, Anup Patel wrote:
> On Fri, Feb 16, 2024 at 9:03 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> I don't think that removing the setup protection is correct.
>>
>> Assume you have maxcpus=N on the kernel command line, then the above
>> for_each_online_cpu() loop would result in cpuhp_setup == true when the
>> instances for the not onlined CPUs are set up, no?
>
> A platform can have multiple PLIC instances where each PLIC
> instance targets a subset of HARTs (or CPUs).
>
> Previously (before this patch), we were probing PLIC very early so on
> a platform with multiple PLIC instances, we need to ensure that cpuhp
> setup is done only after PLIC context associated with boot CPU is
> initialized hence the plic_cpuhp_setup_done check.
>
> This patch converts PLIC driver into a platform driver so now PLIC
> instances are probed after all available CPUs are brought-up. In this
> case, the cpuhp setup must be done only after PLIC context of all
> available CPUs are initialized otherwise some of the CPUs crash
> in plic_starting_cpu() due to lack of PLIC context initialization.

You're missing the point.

Assume you have 8 CPUs and 2 PLIC instances one for CPU0-3 and one for
CPU4-7.

Add "maxcpus=4" on the kernel command line, then only the first 4 CPUs
are brought up.

So at probe time cpu_online_mask has bit 0,1,2,3 set.

When the first PLIC it probed the loop which checks the context for each
online CPU will not clear cpuhp_setup and the hotplug state is installed.

Now the second PLIC is probed (the one for the offline CPUs 4-7) and the
loop will again not clear cpuhp_setup and it tries to install the state
again, no?

Thanks,

        tglx
Anup Patel Feb. 17, 2024, 5:42 a.m. UTC | #4
On Sat, Feb 17, 2024 at 1:52 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, Feb 16 2024 at 22:41, Anup Patel wrote:
> > On Fri, Feb 16, 2024 at 9:03 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> I don't think that removing the setup protection is correct.
> >>
> >> Assume you have maxcpus=N on the kernel command line, then the above
> >> for_each_online_cpu() loop would result in cpuhp_setup == true when the
> >> instances for the not onlined CPUs are set up, no?
> >
> > A platform can have multiple PLIC instances where each PLIC
> > instance targets a subset of HARTs (or CPUs).
> >
> > Previously (before this patch), we were probing PLIC very early so on
> > a platform with multiple PLIC instances, we need to ensure that cpuhp
> > setup is done only after PLIC context associated with boot CPU is
> > initialized hence the plic_cpuhp_setup_done check.
> >
> > This patch converts PLIC driver into a platform driver so now PLIC
> > instances are probed after all available CPUs are brought-up. In this
> > case, the cpuhp setup must be done only after PLIC context of all
> > available CPUs are initialized otherwise some of the CPUs crash
> > in plic_starting_cpu() due to lack of PLIC context initialization.
>
> You're missing the point.
>
> Assume you have 8 CPUs and 2 PLIC instances one for CPU0-3 and one for
> CPU4-7.
>
> Add "maxcpus=4" on the kernel command line, then only the first 4 CPUs
> are brought up.
>
> So at probe time cpu_online_mask has bit 0,1,2,3 set.
>
> When the first PLIC it probed the loop which checks the context for each
> online CPU will not clear cpuhp_setup and the hotplug state is installed.
>
> Now the second PLIC is probed (the one for the offline CPUs 4-7) and the
> loop will again not clear cpuhp_setup and it tries to install the state
> again, no?

Ahh, yes. Good catch.

For the "maxcpus" in kernel command-line, we can't rely on the
cpu_online_mask. I will preserve the plic_cpuhp_setup_done
check in the next revision.

Regards,
Anup
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 5b7bc4fd9517..c8f8a8cdcce1 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -3,7 +3,6 @@ 
  * Copyright (C) 2017 SiFive
  * Copyright (C) 2018 Christoph Hellwig
  */
-#define pr_fmt(fmt) "plic: " fmt
 #include <linux/cpu.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
@@ -64,6 +63,7 @@ 
 #define PLIC_QUIRK_EDGE_INTERRUPT	0
 
 struct plic_priv {
+	struct device *dev;
 	struct cpumask lmask;
 	struct irq_domain *irqdomain;
 	void __iomem *regs;
@@ -85,7 +85,6 @@  struct plic_handler {
 	struct plic_priv	*priv;
 };
 static int plic_parent_irq __ro_after_init;
-static bool plic_cpuhp_setup_done __ro_after_init;
 static DEFINE_PER_CPU(struct plic_handler, plic_handlers);
 
 static int plic_irq_set_type(struct irq_data *d, unsigned int type);
@@ -371,7 +370,8 @@  static void plic_handle_irq(struct irq_desc *desc)
 		int err = generic_handle_domain_irq(handler->priv->irqdomain,
 						    hwirq);
 		if (unlikely(err))
-			pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
+			dev_warn_ratelimited(handler->priv->dev,
+					"can't find mapping for hwirq %lu\n",
 					hwirq);
 	}
 
@@ -406,57 +406,126 @@  static int plic_starting_cpu(unsigned int cpu)
 	return 0;
 }
 
-static int __init __plic_init(struct device_node *node,
-			      struct device_node *parent,
-			      unsigned long plic_quirks)
+static const struct of_device_id plic_match[] = {
+	{ .compatible = "sifive,plic-1.0.0" },
+	{ .compatible = "riscv,plic0" },
+	{ .compatible = "andestech,nceplic100",
+	  .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+	{ .compatible = "thead,c900-plic",
+	  .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
+	{}
+};
+
+static int plic_parse_nr_irqs_and_contexts(struct platform_device *pdev,
+					   u32 *nr_irqs, u32 *nr_contexts)
 {
-	int error = 0, nr_contexts, nr_handlers = 0, i;
-	u32 nr_irqs;
-	struct plic_priv *priv;
+	struct device *dev = &pdev->dev;
+	int rc;
+
+	/*
+	 * Currently, only OF fwnode is supported so extend this
+	 * function for ACPI support.
+	 */
+	if (!is_of_node(dev->fwnode))
+		return -EINVAL;
+
+	rc = of_property_read_u32(to_of_node(dev->fwnode),
+				  "riscv,ndev", nr_irqs);
+	if (rc) {
+		dev_err(dev, "riscv,ndev property not available\n");
+		return rc;
+	}
+
+	*nr_contexts = of_irq_count(to_of_node(dev->fwnode));
+	if (WARN_ON(!(*nr_contexts))) {
+		dev_err(dev, "no PLIC context available\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int plic_parse_context_parent_hwirq(struct platform_device *pdev,
+					   u32 context, u32 *parent_hwirq,
+					   unsigned long *parent_hartid)
+{
+	struct device *dev = &pdev->dev;
+	struct of_phandle_args parent;
+	int rc;
+
+	/*
+	 * Currently, only OF fwnode is supported so extend this
+	 * function for ACPI support.
+	 */
+	if (!is_of_node(dev->fwnode))
+		return -EINVAL;
+
+	rc = of_irq_parse_one(to_of_node(dev->fwnode), context, &parent);
+	if (rc)
+		return rc;
+
+	rc = riscv_of_parent_hartid(parent.np, parent_hartid);
+	if (rc)
+		return rc;
+
+	*parent_hwirq = parent.args[0];
+	return 0;
+}
+
+static int plic_probe(struct platform_device *pdev)
+{
+	int rc, nr_contexts, nr_handlers = 0, i, cpu;
+	unsigned long plic_quirks = 0, hartid;
+	struct device *dev = &pdev->dev;
 	struct plic_handler *handler;
-	unsigned int cpu;
+	u32 nr_irqs, parent_hwirq;
+	struct irq_domain *domain;
+	struct plic_priv *priv;
+	irq_hw_number_t hwirq;
+	struct resource *res;
+	bool cpuhp_setup;
+
+	if (is_of_node(dev->fwnode)) {
+		const struct of_device_id *id;
+
+		id = of_match_node(plic_match, to_of_node(dev->fwnode));
+		if (id)
+			plic_quirks = (unsigned long)id->data;
+	}
 
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
-
+	priv->dev = dev;
 	priv->plic_quirks = plic_quirks;
 
-	priv->regs = of_iomap(node, 0);
-	if (WARN_ON(!priv->regs)) {
-		error = -EIO;
-		goto out_free_priv;
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(dev, "failed to get MMIO resource\n");
+		return -EINVAL;
+	}
+	priv->regs = devm_ioremap(dev, res->start, resource_size(res));
+	if (!priv->regs) {
+		dev_err(dev, "failed map MMIO registers\n");
+		return -EIO;
 	}
 
-	error = -EINVAL;
-	of_property_read_u32(node, "riscv,ndev", &nr_irqs);
-	if (WARN_ON(!nr_irqs))
-		goto out_iounmap;
-
+	rc = plic_parse_nr_irqs_and_contexts(pdev, &nr_irqs, &nr_contexts);
+	if (rc) {
+		dev_err(dev, "failed to parse irqs and contexts\n");
+		return rc;
+	}
 	priv->nr_irqs = nr_irqs;
 
-	priv->prio_save = bitmap_alloc(nr_irqs, GFP_KERNEL);
+	priv->prio_save = devm_bitmap_zalloc(dev, nr_irqs, GFP_KERNEL);
 	if (!priv->prio_save)
-		goto out_free_priority_reg;
-
-	nr_contexts = of_irq_count(node);
-	if (WARN_ON(!nr_contexts))
-		goto out_free_priority_reg;
-
-	error = -ENOMEM;
-	priv->irqdomain = irq_domain_add_linear(node, nr_irqs + 1,
-			&plic_irqdomain_ops, priv);
-	if (WARN_ON(!priv->irqdomain))
-		goto out_free_priority_reg;
+		return -ENOMEM;
 
 	for (i = 0; i < nr_contexts; i++) {
-		struct of_phandle_args parent;
-		irq_hw_number_t hwirq;
-		int cpu;
-		unsigned long hartid;
-
-		if (of_irq_parse_one(node, i, &parent)) {
-			pr_err("failed to parse parent for context %d.\n", i);
+		rc = plic_parse_context_parent_hwirq(pdev, i,
+						     &parent_hwirq, &hartid);
+		if (rc) {
+			dev_warn(dev, "hwirq for context%d not found\n", i);
 			continue;
 		}
 
@@ -464,7 +533,7 @@  static int __init __plic_init(struct device_node *node,
 		 * Skip contexts other than external interrupts for our
 		 * privilege level.
 		 */
-		if (parent.args[0] != RV_IRQ_EXT) {
+		if (parent_hwirq != RV_IRQ_EXT) {
 			/* Disable S-mode enable bits if running in M-mode. */
 			if (IS_ENABLED(CONFIG_RISCV_M_MODE)) {
 				void __iomem *enable_base = priv->regs +
@@ -477,21 +546,17 @@  static int __init __plic_init(struct device_node *node,
 			continue;
 		}
 
-		error = riscv_of_parent_hartid(parent.np, &hartid);
-		if (error < 0) {
-			pr_warn("failed to parse hart ID for context %d.\n", i);
-			continue;
-		}
-
 		cpu = riscv_hartid_to_cpuid(hartid);
 		if (cpu < 0) {
-			pr_warn("Invalid cpuid for context %d\n", i);
+			dev_warn(dev, "Invalid cpuid for context %d\n", i);
 			continue;
 		}
 
 		/* Find parent domain and register chained handler */
-		if (!plic_parent_irq && irq_find_host(parent.np)) {
-			plic_parent_irq = irq_of_parse_and_map(node, i);
+		domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(),
+						  DOMAIN_BUS_ANY);
+		if (!plic_parent_irq && domain) {
+			plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
 			if (plic_parent_irq)
 				irq_set_chained_handler(plic_parent_irq,
 							plic_handle_irq);
@@ -504,7 +569,7 @@  static int __init __plic_init(struct device_node *node,
 		 */
 		handler = per_cpu_ptr(&plic_handlers, cpu);
 		if (handler->present) {
-			pr_warn("handler already present for context %d.\n", i);
+			dev_warn(dev, "handler already present for context%d.\n", i);
 			plic_set_threshold(handler, PLIC_DISABLE_THRESHOLD);
 			goto done;
 		}
@@ -518,10 +583,13 @@  static int __init __plic_init(struct device_node *node,
 			i * CONTEXT_ENABLE_SIZE;
 		handler->priv = priv;
 
-		handler->enable_save =  kcalloc(DIV_ROUND_UP(nr_irqs, 32),
-						sizeof(*handler->enable_save), GFP_KERNEL);
+		handler->enable_save =  devm_kcalloc(dev,
+						DIV_ROUND_UP(nr_irqs, 32),
+						sizeof(*handler->enable_save),
+						GFP_KERNEL);
 		if (!handler->enable_save)
-			goto out_free_enable_reg;
+			return -ENOMEM;
+
 done:
 		for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
 			plic_toggle(handler, hwirq, 0);
@@ -531,52 +599,41 @@  static int __init __plic_init(struct device_node *node,
 		nr_handlers++;
 	}
 
+	priv->irqdomain = irq_domain_create_linear(dev->fwnode, nr_irqs + 1,
+						   &plic_irqdomain_ops, priv);
+	if (WARN_ON(!priv->irqdomain))
+		return -ENOMEM;
+
 	/*
 	 * We can have multiple PLIC instances so setup cpuhp state
-	 * and register syscore operations only when context handler
-	 * for current/boot CPU is present.
+	 * and register syscore operations only after context handlers
+	 * of all online CPUs are initialized.
 	 */
-	handler = this_cpu_ptr(&plic_handlers);
-	if (handler->present && !plic_cpuhp_setup_done) {
+	cpuhp_setup = true;
+	for_each_online_cpu(cpu) {
+		handler = per_cpu_ptr(&plic_handlers, cpu);
+		if (!handler->present) {
+			cpuhp_setup = false;
+			break;
+		}
+	}
+	if (cpuhp_setup) {
 		cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
 				  "irqchip/sifive/plic:starting",
 				  plic_starting_cpu, plic_dying_cpu);
 		register_syscore_ops(&plic_irq_syscore_ops);
-		plic_cpuhp_setup_done = true;
 	}
 
-	pr_info("%pOFP: mapped %d interrupts with %d handlers for"
-		" %d contexts.\n", node, nr_irqs, nr_handlers, nr_contexts);
+	dev_info(dev, "mapped %d interrupts with %d handlers for"
+		" %d contexts.\n", nr_irqs, nr_handlers, nr_contexts);
 	return 0;
-
-out_free_enable_reg:
-	for_each_cpu(cpu, cpu_present_mask) {
-		handler = per_cpu_ptr(&plic_handlers, cpu);
-		kfree(handler->enable_save);
-	}
-out_free_priority_reg:
-	kfree(priv->prio_save);
-out_iounmap:
-	iounmap(priv->regs);
-out_free_priv:
-	kfree(priv);
-	return error;
 }
 
-static int __init plic_init(struct device_node *node,
-			    struct device_node *parent)
-{
-	return __plic_init(node, parent, 0);
-}
-
-IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
-IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
-
-static int __init plic_edge_init(struct device_node *node,
-				 struct device_node *parent)
-{
-	return __plic_init(node, parent, BIT(PLIC_QUIRK_EDGE_INTERRUPT));
-}
-
-IRQCHIP_DECLARE(andestech_nceplic100, "andestech,nceplic100", plic_edge_init);
-IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", plic_edge_init);
+static struct platform_driver plic_driver = {
+	.driver = {
+		.name		= "riscv-plic",
+		.of_match_table	= plic_match,
+	},
+	.probe = plic_probe,
+};
+builtin_platform_driver(plic_driver);