diff mbox series

[v5,05/20] Input: axp20x-pek: Bail out if AXP has no interrupt line connected

Message ID 20210127172500.13356-6-andre.przywara@arm.com (mailing list archive)
State New
Headers show
Series None | expand

Commit Message

Andre Przywara Jan. 27, 2021, 5:24 p.m. UTC
On at least one board (Orangepi Zero2) the AXP305 PMIC does not have its
interrupt line connected to the CPU (mostly because the H616 SoC does
not feature an NMI pin anymore).
After allowing the AXP driver to proceed without an "interrupts"
property [1], the axp20x-pek driver crashes with a NULL pointer
dereference (see below).

Check for the regmap_irqc member to be not NULL before proceeding with
probe. This gets normally filled by the call to regmap_add_irq_chip(),
which we allow to skip now, when the DT node lacks an interrupt
property.

....
[    3.843388] sunxi-rsb 7083000.rsb: RSB running at 3000000 Hz
[    3.849972] axp20x-rsb sunxi-rsb-745: AXP20x variant AXP806 found
[    3.857971] Unable to handle kernel NULL pointer dereference at
virtual address 00000000000001b8
[    3.866855] Mem abort info:
[    3.869691]   ESR = 0x96000004
[    3.872749]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.878092]   SET = 0, FnV = 0
[    3.881149]   EA = 0, S1PTW = 0
[    3.884309] Data abort info:
[    3.887210]   ISV = 0, ISS = 0x00000004
[    3.891062]   CM = 0, WnR = 0
[    3.894049] [00000000000001b8] user address but active_mm is swapper
[    3.900590] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[    3.906166] Modules linked in:
[    3.909227] CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 5.11.0-rc1
[    3.915925] Hardware name: OrangePi Zero2 (DT)
[    3.920367] Workqueue: events deferred_probe_work_func
[    3.925518] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
[    3.931522] pc : regmap_irq_get_virq+0x0/0x48
[    3.935883] lr : axp20x_pek_probe+0x78/0x200
[    3.940147] sp : ffff800012fdb450
[    3.943459] x29: ffff800012fdb450 x28: ffff0000054af340
[    3.948776] x27: ffff000005534000 x26: ffff000005534810
[    3.954091] x25: ffff800012883028 x24: 0000000000000002
[    3.959407] x23: ffff80001157eaf0 x22: ffff000005534810
[    3.964722] x21: ffff000005534800 x20: ffff0000054b0580
[    3.970037] x19: 000000000000000b x18: 0000000000000000
[    3.975353] x17: 0000000000000001 x16: 0000000000000019
[    3.980668] x15: 000002ce4ea04ae6 x14: 000000000000014f
[    3.985983] x13: 0000000000000282 x12: 0000000000000030
[    3.991298] x11: 0000000000000038 x10: 0101010101010101
[    3.996613] x9 : 0000000000000000 x8 : 7f7f7f7f7f7f7f7f
[    4.001928] x7 : ff5141435e4a444f x6 : 0000000000000080
[    4.007243] x5 : 0000000000000000 x4 : 8000000000000000
[    4.012558] x3 : 0000000000000000 x2 : 0000000000000000
[    4.017872] x1 : 000000000000000b x0 : 0000000000000000
[    4.023188] Call trace:
[    4.025635]  regmap_irq_get_virq+0x0/0x48
[    4.029646]  platform_probe+0x68/0xd8
[    4.033312]  really_probe+0xe4/0x3b0
[    4.036889]  driver_probe_device+0x58/0xb8
[    4.040986]  __device_attach_driver+0x84/0xc8
[    4.045342]  bus_for_each_drv+0x78/0xc8
[    4.049178]  __device_attach+0xf0/0x150
[    4.053013]  device_initial_probe+0x14/0x20
[    4.057196]  bus_probe_device+0x9c/0xa8
[    4.061032]  device_add+0x36c/0x7b8
[    4.064525]  platform_device_add+0x100/0x238
[    4.068796]  mfd_add_devices+0x494/0x718
[    4.072721]  axp20x_device_probe+0x70/0x158
[    4.076904]  axp20x_rsb_probe+0x94/0xd0
[    4.080741]  sunxi_rsb_device_probe+0x6c/0x88
[    4.085102]  really_probe+0xe4/0x3b0
[    4.088679]  driver_probe_device+0x58/0xb8
[    4.092776]  __device_attach_driver+0x84/0xc8
[    4.097132]  bus_for_each_drv+0x78/0xc8
[    4.100967]  __device_attach+0xf0/0x150
[    4.104803]  device_initial_probe+0x14/0x20
[    4.108986]  bus_probe_device+0x9c/0xa8
[    4.112821]  device_add+0x36c/0x7b8
[    4.116313]  device_register+0x20/0x30
[    4.120065]  sunxi_rsb_probe+0x4e4/0x608
....

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2021-January/633392.html

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 drivers/input/misc/axp20x-pek.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Dmitry Torokhov Jan. 27, 2021, 7:42 p.m. UTC | #1
Hi Andre,

On Wed, Jan 27, 2021 at 05:24:45PM +0000, Andre Przywara wrote:
> On at least one board (Orangepi Zero2) the AXP305 PMIC does not have its
> interrupt line connected to the CPU (mostly because the H616 SoC does
> not feature an NMI pin anymore).
> After allowing the AXP driver to proceed without an "interrupts"
> property [1], the axp20x-pek driver crashes with a NULL pointer
> dereference (see below).
> 
> Check for the regmap_irqc member to be not NULL before proceeding with
> probe. This gets normally filled by the call to regmap_add_irq_chip(),
> which we allow to skip now, when the DT node lacks an interrupt
> property.

No, the driver is not the right place to patch this; regmap should be
fixed so it does not crash instead.

Thanks.
Andre Przywara Jan. 28, 2021, 11:11 a.m. UTC | #2
On Wed, 27 Jan 2021 11:42:15 -0800
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

Hi Dmitry,

thanks for your feedback!

> On Wed, Jan 27, 2021 at 05:24:45PM +0000, Andre Przywara wrote:
> > On at least one board (Orangepi Zero2) the AXP305 PMIC does not have its
> > interrupt line connected to the CPU (mostly because the H616 SoC does
> > not feature an NMI pin anymore).
> > After allowing the AXP driver to proceed without an "interrupts"
> > property [1], the axp20x-pek driver crashes with a NULL pointer
> > dereference (see below).
> > 
> > Check for the regmap_irqc member to be not NULL before proceeding with
> > probe. This gets normally filled by the call to regmap_add_irq_chip(),
> > which we allow to skip now, when the DT node lacks an interrupt
> > property.  
> 
> No, the driver is not the right place to patch this; regmap should be
> fixed so it does not crash instead.

I am not sure this is the right approach, those regmap functions look
more like an internal interface to me, with lots of wrapper functions
happily dereferencing pointers and reaching into structs. Moving
NULL checks into those does not sound like the right thing. CC:ing Mark
for more opinions on this.

A more general solution would be to not instantiate this driver here
at all, when we don't have an interrupt line.
However at the moment the AXP MFD driver uses a const struct to hold
all MFD cells, so there is no easy way of omitting the power key
device dynamically. And even then it would hard code the requirement
for an interrupt into the MFD driver, when this could be considered an
implementation detail of the axp20x-pek driver.

That's why I came up with this patch here, which was the easiest and
cleanest: This driver *requires* a valid regmap_irqc, so it should
verify this at probe time, kind of like a normal driver would bail out
if no IRQ line could be reserved.

Let me know what you think!

Cheers,
Andre
Mark Brown Jan. 28, 2021, 11:36 a.m. UTC | #3
On Thu, Jan 28, 2021 at 11:11:28AM +0000, Andre Przywara wrote:
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
> > On Wed, Jan 27, 2021 at 05:24:45PM +0000, Andre Przywara wrote:

> > > Check for the regmap_irqc member to be not NULL before proceeding with
> > > probe. This gets normally filled by the call to regmap_add_irq_chip(),
> > > which we allow to skip now, when the DT node lacks an interrupt
> > > property.  

It sounds like you're trying to register an IRQ chip with a somehow
bogus configuration?

> > No, the driver is not the right place to patch this; regmap should be
> > fixed so it does not crash instead.

> I am not sure this is the right approach, those regmap functions look
> more like an internal interface to me, with lots of wrapper functions
> happily dereferencing pointers and reaching into structs. Moving
> NULL checks into those does not sound like the right thing. CC:ing Mark
> for more opinions on this.

Without having seen the actual issue if you're trying to register an
interrupt controller with a known broken hardware configuration that
does seem like something the caller just shouldn't be doing, it's not
something that's going to transiently happen at runtime and we're very
much trusting that the caller got things right.

> A more general solution would be to not instantiate this driver here
> at all, when we don't have an interrupt line.
> However at the moment the AXP MFD driver uses a const struct to hold
> all MFD cells, so there is no easy way of omitting the power key
> device dynamically. And even then it would hard code the requirement
> for an interrupt into the MFD driver, when this could be considered an
> implementation detail of the axp20x-pek driver.

Another approach is to just register the optional device separately.
Andre Przywara Jan. 28, 2021, 12:31 p.m. UTC | #4
On Thu, 28 Jan 2021 11:36:01 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Thu, Jan 28, 2021 at 11:11:28AM +0000, Andre Przywara wrote:
> > Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:  
> > > On Wed, Jan 27, 2021 at 05:24:45PM +0000, Andre Przywara wrote:  
> 
> > > > Check for the regmap_irqc member to be not NULL before proceeding with
> > > > probe. This gets normally filled by the call to regmap_add_irq_chip(),
> > > > which we allow to skip now, when the DT node lacks an interrupt
> > > > property.    
> 
> It sounds like you're trying to register an IRQ chip with a somehow
> bogus configuration?

Quick background: Those AXP PMICs have an IRQ pin, that was always
connected to the NMI pin on Allwinner SoCs. This was used for the power
button GPIO interrupt. Now the H616 does not have this pin anymore, and
the board does not use a GPIO either.
I patched the AXP MFD driver [1] to skip the regmap-irq creation when no
interrupts DT property was found, but this NULL pointer now
understandably confuses the -pek driver, and leads to this crash:

http://lists.infradead.org/pipermail/linux-arm-kernel/2021-January/634969.html

Hence I wanted to plug this hole, which seems useful regardless of this
particular issue.

[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2021-January/634971.html

> > > No, the driver is not the right place to patch this; regmap should be
> > > fixed so it does not crash instead.  
> 
> > I am not sure this is the right approach, those regmap functions look
> > more like an internal interface to me, with lots of wrapper functions
> > happily dereferencing pointers and reaching into structs. Moving
> > NULL checks into those does not sound like the right thing. CC:ing Mark
> > for more opinions on this.  
> 
> Without having seen the actual issue if you're trying to register an
> interrupt controller with a known broken hardware configuration that
> does seem like something the caller just shouldn't be doing, it's not
> something that's going to transiently happen at runtime and we're very
> much trusting that the caller got things right.
> 
> > A more general solution would be to not instantiate this driver here
> > at all, when we don't have an interrupt line.
> > However at the moment the AXP MFD driver uses a const struct to hold
> > all MFD cells, so there is no easy way of omitting the power key
> > device dynamically. And even then it would hard code the requirement
> > for an interrupt into the MFD driver, when this could be considered an
> > implementation detail of the axp20x-pek driver.  
> 
> Another approach is to just register the optional device separately.

I will have a look at how much this takes.

Thanks,
Andre
Mark Brown Jan. 28, 2021, 3:05 p.m. UTC | #5
On Thu, Jan 28, 2021 at 12:31:36PM +0000, Andre Przywara wrote:
> Mark Brown <broonie@kernel.org> wrote:

> > It sounds like you're trying to register an IRQ chip with a somehow
> > bogus configuration?

> I patched the AXP MFD driver [1] to skip the regmap-irq creation when no
> interrupts DT property was found, but this NULL pointer now
> understandably confuses the -pek driver, and leads to this crash:

> http://lists.infradead.org/pipermail/linux-arm-kernel/2021-January/634969.html

> Hence I wanted to plug this hole, which seems useful regardless of this
> particular issue.

The driver code here looks pretty confused.  It appears to be looking up
the interrupt to use from a resource (which is what I'd expect for a MFD
child) then for reasons I can't fathom trying to pass that resource into
regmap_irq_get_virq() which is at best going to just return the value
that was passed in but may potentially end up just returning a random
interrupt other than the one that was asked for since we're passing in a
global interrupt number rather than a controller relative one.  I really
can't tell what's supposed to be going on there.  A driver should either
use resources or it should use regmap_irq_get_virq(), using both is a
bug.

The MFD for this device is also just plain buggy in that it is providing
IRQ resources to the children when there is in fact no support for the
interrupts on the device in the system.  This means that the MFD core
sees that it has no interrupt domain, assumes that those interrupt
resources are in fact absolute interrupt numbers and passes them
straight through to the children.  This means that the children will
just be requesting random interrupts in the system which may actually
exist and be requestable which probably isn't going to end well.  When
there is no interrupt controller the parent should not be trying to
supply interrupt resources to the children at all.
diff mbox series

Patch

diff --git a/drivers/input/misc/axp20x-pek.c b/drivers/input/misc/axp20x-pek.c
index 9c6386b2af33..abe52ef194ee 100644
--- a/drivers/input/misc/axp20x-pek.c
+++ b/drivers/input/misc/axp20x-pek.c
@@ -309,6 +309,10 @@  static int axp20x_pek_probe(struct platform_device *pdev)
 
 	axp20x_pek->axp20x = dev_get_drvdata(pdev->dev.parent);
 
+	/* In case there is no interrupt line from the AXP towards the CPU. */
+	if (!axp20x_pek->axp20x->regmap_irqc)
+		return -ENODEV;
+
 	axp20x_pek->irq_dbr = platform_get_irq_byname(pdev, "PEK_DBR");
 	if (axp20x_pek->irq_dbr < 0)
 		return axp20x_pek->irq_dbr;