diff mbox

Revert "irqchip: irq-dove: Add PMU interrupt controller."

Message ID 1393911160-7688-1-git-send-email-jason@lakedaemon.net (mailing list archive)
State New, archived
Headers show

Commit Message

Jason Cooper March 4, 2014, 5:32 a.m. UTC
This reverts commit 40b367d95fb3d60fc1edb9ba8f6ef52272e48936.

Russell King has raised the idea of creating a proper PMU driver for
this SoC that would incorporate the functionality currently in this
driver.  It would also cover the use case for the graphics subsystem on
this SoC.

To prevent having to maintain the devicetree ABI for this limited
interrupt-handler driver, we revert the driver before it hits a mainline
tagged release (eg v3.15).

Signed-off-by: Jason Cooper <jason@lakedaemon.net>
---
Thomas,

Well, this is embarrassing.  It took so long to get this driver on the road to
mainline, only to realize today that we were going to paint ourselves into a
corner wrt the devicetree ABI this creates.

So, rather let a bad ABI make it to mainline, we revert the driver.  We'll sit
down with Russell, who's most familiar with the graphics subsystem on this SoC,
and hammer out a better longterm solution.

You can pick this, or I can send you an incremental pull request.  Whichever is
easiest for you.  I've already removed the DT node from the mvebu tree, so this
won't be used by anything.

thx,

Jason.

 .../interrupt-controller/marvell,dove-pmu-intc.txt |  17 ---
 drivers/irqchip/Makefile                           |   1 -
 drivers/irqchip/irq-dove.c                         | 126 ---------------------
 3 files changed, 144 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/interrupt-controller/marvell,dove-pmu-intc.txt
 delete mode 100644 drivers/irqchip/irq-dove.c

Comments

Russell King - ARM Linux March 5, 2014, 12:41 a.m. UTC | #1
On Tue, Mar 04, 2014 at 05:32:40AM +0000, Jason Cooper wrote:
> -static void dove_pmu_irq_handler(unsigned int irq, struct irq_desc *desc)
> -{
> -	struct irq_domain *d = irq_get_handler_data(irq);
> -	struct irq_chip_generic *gc = irq_get_domain_generic_chip(d, 0);
> -	u32 stat = readl_relaxed(gc->reg_base + DOVE_PMU_IRQ_CAUSE) &
> -		   gc->mask_cache;
> -
> -	while (stat) {
> -		u32 hwirq = ffs(stat) - 1;
> -
> -		generic_handle_irq(irq_find_mapping(d, gc->irq_base + hwirq));
> -		stat &= ~(1 << hwirq);
> -	}
> -}
> -
> -static void pmu_irq_ack(struct irq_data *d)
> -{
> -	struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> -	struct irq_chip_type *ct = irq_data_get_chip_type(d);
> -	u32 mask = ~d->mask;
> -
> -	/*
> -	 * The PMU mask register is not RW0C: it is RW.	 This means that
> -	 * the bits take whatever value is written to them; if you write
> -	 * a '1', you will set the interrupt.
> -	 *
> -	 * Unfortunately this means there is NO race free way to clear
> -	 * these interrupts.
> -	 *
> -	 * So, let's structure the code so that the window is as small as
> -	 * possible.
> -	 */
> -	irq_gc_lock(gc);
> -	mask &= irq_reg_readl(gc->reg_base +  ct->regs.ack);
> -	irq_reg_writel(mask, gc->reg_base +  ct->regs.ack);
> -	irq_gc_unlock(gc);
> -}

Jason, Thomas,

I've just been giving the above a whirl here with the RTC, and it
doesn't seem to quite work as it should.  Not your problem, because it's
as the code is originally.

Let's say you set an alarm for 10sec time.  When the alarm fires:

- we read the PMU interrupt status, mask it with the mask register,
  and find the RTC pending.
- we call the genirq layer for this interrupt.
- genirq does the mask + ack thing.
- the RTC interrupt handler is called, and there's the RTC says there's
  an interrupt pending.
- the RTC handler clears the interrupt, and returns.
- genirq unmasks the interrupt, and returns.
- dove_pmu_irq_handler() is re-entered, and again, we find that the
  RTC interrupt is pending.
- follow the above...
- the RTC interrupt handler is called, but this time there's no interrupt
  pending, so returns IRQ_NONE
- genirq unmasks the interrupt, and returns.

The reason this happens is that the attempt to "ack" - rather "clear" the
interrupt the first time around has no effect - the RTC is still asserting
the interrupt, so the write to clear the register results in the bit
remaining set.

The second time around, we've already cleared the RTC interrupt, so this
time, the ack clears the interrupt down properly.

In some ways, this is good news - it shows that the bits in this register
latch '1' when an interrupt is pending, and remain '1' while the block
continues to assert its interrupt signal - but can we say that the other
interrupt functions in this register have that behaviour?

From the spec, it looks like this is probably true of DFSDone as well.
DVSDone - I see no separate status register containing status bits
indicating what the cause of the DVSDone status is.  The thermal bits -
if it's a transitory excursion, may not hold.  Battery fault... we
can guess.

Now, genirq doesn't have a good way to handle this.  I'll also say that
because of the above, I've always been worried about hardware races when
trying to clear down interrupts in this register - I'd much prefer not
to touch it unless absolutely necessary.  So... how about this instead?

        u32 stat = readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE) &
                   gc->mask_cache;
	u32 done = ~0;

	while (stat) {
		unsigned hwirq = ffs(stat) - 1;

		stat &= ~(1 << hwirq);
		done &= ~(1 << hwirq);

		generic_handle_irq(irq_find_mapping(domain, hwirq));
	}

	irq_gc_lock(gc);
	done &= readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE);
	writel_relaxed(done, gc->reg_base + DOVE_PMC_IRQ_CAUSE);
	irq_gc_unlock(gc);

This results in the RTC alarm test receiving exactly one interrupt for
each alarm expiry, as it should do.  Thoughts?

Another question: ffs(stat) - any reason to use ffs() there rather than
fls(stat) which would result in simpler code?  r1 = ffs(r4 = stat) creates:

 198:   e2641000        rsb     r1, r4, #0
 19c:   e1a00006        mov     r0, r6
 1a0:   e0011004        and     r1, r1, r4
 1a4:   e16f1f11        clz     r1, r1
 1a8:   e261101f        rsb     r1, r1, #31

whereas fls(stat):

 198:   e16f1f14        clz     r1, r4
 19c:   e261101f        rsb     r1, r1, #31
 1a0:   e1a00006        mov     r0, r6

Kind of a micro-optimisation, but I see no reason to prefer one over the
other except for this - and I think the switch to ffs() was made in the
hope of optimising this code!
Andrew Lunn March 5, 2014, 9:24 a.m. UTC | #2
> >From the spec, it looks like this is probably true of DFSDone as well.

The cpufreq driver makes use of DFSDone. I've never had it loop
endlessly. There is also no action needed to clear it in the DFS
hardware.

	Andrew
Carlo Caione March 5, 2014, 11:52 a.m. UTC | #3
On Wed, Mar 5, 2014 at 1:41 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Mar 04, 2014 at 05:32:40AM +0000, Jason Cooper wrote:
>
> Jason, Thomas,
>
> I've just been giving the above a whirl here with the RTC, and it
> doesn't seem to quite work as it should.  Not your problem, because it's
> as the code is originally.
>
> Let's say you set an alarm for 10sec time.  When the alarm fires:
>
> - we read the PMU interrupt status, mask it with the mask register,
>   and find the RTC pending.
> - we call the genirq layer for this interrupt.
> - genirq does the mask + ack thing.
> - the RTC interrupt handler is called, and there's the RTC says there's
>   an interrupt pending.
> - the RTC handler clears the interrupt, and returns.
> - genirq unmasks the interrupt, and returns.
> - dove_pmu_irq_handler() is re-entered, and again, we find that the
>   RTC interrupt is pending.
> - follow the above...
> - the RTC interrupt handler is called, but this time there's no interrupt
>   pending, so returns IRQ_NONE
> - genirq unmasks the interrupt, and returns.
>
> The reason this happens is that the attempt to "ack" - rather "clear" the
> interrupt the first time around has no effect - the RTC is still asserting
> the interrupt, so the write to clear the register results in the bit
> remaining set.
>
> The second time around, we've already cleared the RTC interrupt, so this
> time, the ack clears the interrupt down properly.
>
> In some ways, this is good news - it shows that the bits in this register
> latch '1' when an interrupt is pending, and remain '1' while the block
> continues to assert its interrupt signal - but can we say that the other
> interrupt functions in this register have that behaviour?

I don't know if this could help but I have a very similar problem with
the NMI controller found on sun6i/sun7i for which is pending the
patchset http://comments.gmane.org/gmane.comp.hardware.netbook.arm.sunxi/7554

Also in that case the first ACK has no effect and in the end I use the
unmask hook in the irqchip driver to ACK again the line before
unmasking it (the whole discussion about this problem among me, Maxime
and Hans is in this thread
http://www.spinics.net/lists/arm-kernel/msg299952.html )

I also proposed a small (and as it turned out incomplete) patch here
http://www.spinics.net/lists/arm-kernel/msg305616.html but in the end
I preferred to use the unmask hook.
Thomas Gleixner March 5, 2014, 2:42 p.m. UTC | #4
On Wed, 5 Mar 2014, Russell King - ARM Linux wrote:
> In some ways, this is good news - it shows that the bits in this register
> latch '1' when an interrupt is pending, and remain '1' while the block
> continues to assert its interrupt signal - but can we say that the other
> interrupt functions in this register have that behaviour?
> 
> >From the spec, it looks like this is probably true of DFSDone as well.
> DVSDone - I see no separate status register containing status bits
> indicating what the cause of the DVSDone status is.  The thermal bits -
> if it's a transitory excursion, may not hold.  Battery fault... we
> can guess.
> 
> Now, genirq doesn't have a good way to handle this.  I'll also say that
> because of the above, I've always been worried about hardware races when
> trying to clear down interrupts in this register - I'd much prefer not
> to touch it unless absolutely necessary.  So... how about this instead?
> 
>         u32 stat = readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE) &
>                    gc->mask_cache;
> 	u32 done = ~0;
> 
> 	while (stat) {
> 		unsigned hwirq = ffs(stat) - 1;
> 
> 		stat &= ~(1 << hwirq);
> 		done &= ~(1 << hwirq);
> 
> 		generic_handle_irq(irq_find_mapping(domain, hwirq));
> 	}
> 
> 	irq_gc_lock(gc);
> 	done &= readl_relaxed(gc->reg_base + DOVE_PMC_IRQ_CAUSE);
> 	writel_relaxed(done, gc->reg_base + DOVE_PMC_IRQ_CAUSE);
> 	irq_gc_unlock(gc);
> 
> This results in the RTC alarm test receiving exactly one interrupt for
> each alarm expiry, as it should do.  Thoughts?

You are worried about clearing an interrupt which is transitory and
not kept active at the device level until you handled it for real,
right?

Is the datasheet for this stuff public available?

> Another question: ffs(stat) - any reason to use ffs() there rather than
> fls(stat) which would result in simpler code?  r1 = ffs(r4 = stat) creates:
> 
>  198:   e2641000        rsb     r1, r4, #0
>  19c:   e1a00006        mov     r0, r6
>  1a0:   e0011004        and     r1, r1, r4
>  1a4:   e16f1f11        clz     r1, r1
>  1a8:   e261101f        rsb     r1, r1, #31
> 
> whereas fls(stat):
> 
>  198:   e16f1f14        clz     r1, r4
>  19c:   e261101f        rsb     r1, r1, #31
>  1a0:   e1a00006        mov     r0, r6
> 
> Kind of a micro-optimisation, but I see no reason to prefer one over the
> other except for this - and I think the switch to ffs() was made in the
> hope of optimising this code!

I don't think it matters in which order you process multiple pending
interrupts.

Thanks,

	tglx
Russell King - ARM Linux March 5, 2014, 7:20 p.m. UTC | #5
On Wed, Mar 05, 2014 at 03:42:34PM +0100, Thomas Gleixner wrote:
> On Wed, 5 Mar 2014, Russell King - ARM Linux wrote:
> > This results in the RTC alarm test receiving exactly one interrupt for
> > each alarm expiry, as it should do.  Thoughts?
> 
> You are worried about clearing an interrupt which is transitory and
> not kept active at the device level until you handled it for real,
> right?

Yep.  Let's take the code:

	ldr	r0, [r1]		; read the interrupt cause register
	and	r0, r0, r2		; clear interrupts we've serviced
	str	r0, [r1]		; write it back

The problem here is if a transitory interrupt is received between the
load and store, the write can clear it back to zero.  There's nothing
which can be done to get around that - which is why I'd prefer to do
this as infrequently as necessary.

> Is the datasheet for this stuff public available?

Thankfully, it is, but like many such things, it'll leave you with /lots/
of questions.  In the case of this register, the documentation only goes
as far as describing the bits, but doesn't really describe their behaviour.
Much of that can only come via experimentation with the hardware. :(

> I don't think it matters in which order you process multiple pending
> interrupts.

Me neither - I'm just going to use fls() for no other reason that it
produces more efficient code.  My comments on that were to see whether
I'd missed anything, and to stave off any review comments about why
it's changed :)
Thomas Gleixner March 5, 2014, 9:36 p.m. UTC | #6
On Wed, 5 Mar 2014, Russell King - ARM Linux wrote:
> On Wed, Mar 05, 2014 at 03:42:34PM +0100, Thomas Gleixner wrote:
> > On Wed, 5 Mar 2014, Russell King - ARM Linux wrote:
> > > This results in the RTC alarm test receiving exactly one interrupt for
> > > each alarm expiry, as it should do.  Thoughts?
> > 
> > You are worried about clearing an interrupt which is transitory and
> > not kept active at the device level until you handled it for real,
> > right?
> 
> Yep.  Let's take the code:
> 
> 	ldr	r0, [r1]		; read the interrupt cause register
> 	and	r0, r0, r2		; clear interrupts we've serviced
> 	str	r0, [r1]		; write it back
> 
> The problem here is if a transitory interrupt is received between the
> load and store, the write can clear it back to zero.  There's nothing
> which can be done to get around that - which is why I'd prefer to do
> this as infrequently as necessary.

Yes, that's the only sensible thing you can do. Is there any transient
interrupt connected to that irq controller or is this as vague as the
rest of the documentation?
 
> > Is the datasheet for this stuff public available?
> 
> Thankfully, it is, but like many such things, it'll leave you with /lots/
> of questions.  In the case of this register, the documentation only goes
> as far as describing the bits, but doesn't really describe their behaviour.
> Much of that can only come via experimentation with the hardware. :(

Sigh. I really want to understand why SoC companies waste lots of
resources to implement pointless and disfunctional variants of
interrupt controllers (or timers) over and over.

Thanks,

	tglx
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/interrupt-controller/marvell,dove-pmu-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/marvell,dove-pmu-intc.txt
deleted file mode 100644
index 1feb5825d372..000000000000
--- a/Documentation/devicetree/bindings/interrupt-controller/marvell,dove-pmu-intc.txt
+++ /dev/null
@@ -1,17 +0,0 @@ 
-Marvell Dove Power Management Unit interrupt controller
-
-Required properties:
-- compatible: shall be "marvell,dove-pmu-intc"
-- reg: base address of PMU interrupt registers starting with CAUSE register
-- interrupts: PMU interrupt of the main interrupt controller
-- interrupt-controller: identifies the node as an interrupt controller
-- #interrupt-cells: number of cells to encode an interrupt source, shall be 1
-
-Example:
-	pmu_intc: pmu-interrupt-ctrl@d0050 {
-		compatible = "marvell,dove-pmu-intc";
-		interrupt-controller;
-		#interrupt-cells = <1>;
-		reg = <0xd0050 0x8>;
-		interrupts = <33>;
-	};
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index f743006ce7ad..c60b9010b152 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -1,7 +1,6 @@ 
 obj-$(CONFIG_IRQCHIP)			+= irqchip.o
 
 obj-$(CONFIG_ARCH_BCM2835)		+= irq-bcm2835.o
-obj-$(CONFIG_ARCH_DOVE)			+= irq-dove.o
 obj-$(CONFIG_ARCH_EXYNOS)		+= exynos-combiner.o
 obj-$(CONFIG_ARCH_MMP)			+= irq-mmp.o
 obj-$(CONFIG_ARCH_MVEBU)		+= irq-armada-370-xp.o
diff --git a/drivers/irqchip/irq-dove.c b/drivers/irqchip/irq-dove.c
deleted file mode 100644
index 788acd89848a..000000000000
--- a/drivers/irqchip/irq-dove.c
+++ /dev/null
@@ -1,126 +0,0 @@ 
-/*
- * Marvell Dove SoCs PMU IRQ chip driver.
- *
- * Andrew Lunn <andrew@lunn.ch>
- *
- * This file is licensed under the terms of the GNU General Public
- * License version 2.  This program is licensed "as is" without any
- * warranty of any kind, whether express or implied.
- */
-
-#include <linux/io.h>
-#include <linux/irq.h>
-#include <linux/of.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <asm/exception.h>
-#include <asm/mach/irq.h>
-
-#include "irqchip.h"
-
-#define DOVE_PMU_IRQ_CAUSE	0x00
-#define DOVE_PMU_IRQ_MASK	0x04
-
-static void dove_pmu_irq_handler(unsigned int irq, struct irq_desc *desc)
-{
-	struct irq_domain *d = irq_get_handler_data(irq);
-	struct irq_chip_generic *gc = irq_get_domain_generic_chip(d, 0);
-	u32 stat = readl_relaxed(gc->reg_base + DOVE_PMU_IRQ_CAUSE) &
-		   gc->mask_cache;
-
-	while (stat) {
-		u32 hwirq = ffs(stat) - 1;
-
-		generic_handle_irq(irq_find_mapping(d, gc->irq_base + hwirq));
-		stat &= ~(1 << hwirq);
-	}
-}
-
-static void pmu_irq_ack(struct irq_data *d)
-{
-	struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
-	struct irq_chip_type *ct = irq_data_get_chip_type(d);
-	u32 mask = ~d->mask;
-
-	/*
-	 * The PMU mask register is not RW0C: it is RW.	 This means that
-	 * the bits take whatever value is written to them; if you write
-	 * a '1', you will set the interrupt.
-	 *
-	 * Unfortunately this means there is NO race free way to clear
-	 * these interrupts.
-	 *
-	 * So, let's structure the code so that the window is as small as
-	 * possible.
-	 */
-	irq_gc_lock(gc);
-	mask &= irq_reg_readl(gc->reg_base +  ct->regs.ack);
-	irq_reg_writel(mask, gc->reg_base +  ct->regs.ack);
-	irq_gc_unlock(gc);
-}
-
-static int __init dove_pmu_irq_init(struct device_node *np,
-				    struct device_node *parent)
-{
-	unsigned int clr = IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN;
-	struct resource r;
-	struct irq_domain *domain;
-	struct irq_chip_generic *gc;
-	int ret, irq, nrirqs = 7;
-
-	domain = irq_domain_add_linear(np, nrirqs,
-				       &irq_generic_chip_ops, NULL);
-	if (!domain) {
-		pr_err("%s: unable to add irq domain\n", np->name);
-		return -ENOMEM;
-	}
-
-	ret = irq_alloc_domain_generic_chips(domain, nrirqs, 1, np->name,
-			     handle_level_irq, clr, 0, IRQ_GC_INIT_MASK_CACHE);
-	if (ret) {
-		pr_err("%s: unable to alloc irq domain gc\n", np->name);
-		return ret;
-	}
-
-	ret = of_address_to_resource(np, 0, &r);
-	if (ret) {
-		pr_err("%s: unable to get resource\n", np->name);
-		return ret;
-	}
-
-	if (!request_mem_region(r.start, resource_size(&r), np->name)) {
-		pr_err("%s: unable to request mem region\n", np->name);
-		return -ENOMEM;
-	}
-
-	/* Map the parent interrupt for the chained handler */
-	irq = irq_of_parse_and_map(np, 0);
-	if (irq <= 0) {
-		pr_err("%s: unable to parse irq\n", np->name);
-		return -EINVAL;
-	}
-
-	gc = irq_get_domain_generic_chip(domain, 0);
-	gc->reg_base = ioremap(r.start, resource_size(&r));
-	if (!gc->reg_base) {
-		pr_err("%s: unable to map resource\n", np->name);
-		return -ENOMEM;
-	}
-
-	gc->chip_types[0].regs.ack = DOVE_PMU_IRQ_CAUSE;
-	gc->chip_types[0].regs.mask = DOVE_PMU_IRQ_MASK;
-	gc->chip_types[0].chip.irq_ack = pmu_irq_ack;
-	gc->chip_types[0].chip.irq_mask = irq_gc_mask_clr_bit;
-	gc->chip_types[0].chip.irq_unmask = irq_gc_mask_set_bit;
-
-	/* mask and clear all interrupts */
-	writel(0, gc->reg_base + DOVE_PMU_IRQ_MASK);
-	writel(0, gc->reg_base + DOVE_PMU_IRQ_CAUSE);
-
-	irq_set_handler_data(irq, domain);
-	irq_set_chained_handler(irq, dove_pmu_irq_handler);
-
-	return 0;
-}
-IRQCHIP_DECLARE(dove_pmu_intc,
-		"marvell,dove-pmu-intc", dove_pmu_irq_init);