
[2/2] spi: spi-fsl-dspi: Initialize completion before possible interrupt

Message ID 1592132154-20175-2-git-send-email-krzk@kernel.org (mailing list archive)
State Superseded
Series [1/2] spi: spi-fsl-dspi: Fix external abort on interrupt in exit paths

Commit Message

Krzysztof Kozlowski June 14, 2020, 10:55 a.m. UTC
If an interrupt fires early, dspi_interrupt() could complete
dspi->xfer_done before the completion is initialized.

Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
---
 drivers/spi/spi-fsl-dspi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Vladimir Oltean June 14, 2020, 11:14 a.m. UTC | #1
On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> If interrupt fires early, the dspi_interrupt() could complete
> (dspi->xfer_done) before its initialization happens.
>
> Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> ---

Why would an interrupt fire before spi_register_controller, therefore
before dspi_transfer_one_message could get called?
Is this master or slave mode?

>  drivers/spi/spi-fsl-dspi.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index 57e7a626ba00..efb63ed9fd86 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -1385,6 +1385,8 @@ static int dspi_probe(struct platform_device *pdev)
>                 goto poll_mode;
>         }
>
> +       init_completion(&dspi->xfer_done);
> +
>         ret = request_threaded_irq(dspi->irq, dspi_interrupt, NULL,
>                                    IRQF_SHARED, pdev->name, dspi);
>         if (ret < 0) {
> @@ -1392,8 +1394,6 @@ static int dspi_probe(struct platform_device *pdev)
>                 goto out_clk_put;
>         }
>
> -       init_completion(&dspi->xfer_done);
> -
>  poll_mode:
>
>         if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
> --
> 2.7.4
>
Krzysztof Kozlowski June 14, 2020, 11:18 a.m. UTC | #2
On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >
> > If interrupt fires early, the dspi_interrupt() could complete
> > (dspi->xfer_done) before its initialization happens.
> >
> > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > ---
> 
> Why would an interrupt fire before spi_register_controller, therefore
> before dspi_transfer_one_message could get called?
> Is this master or slave mode?

I guess practically it won't fire.  It's more of a matter of logical
order and:
1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
2. The hardware is actually initialized before that, and someone could
   attach some weird device to the SPI bus.

Best regards,
Krzysztof
Wolfram Sang June 14, 2020, 11:18 a.m. UTC | #3
> > If interrupt fires early, the dspi_interrupt() could complete
> > (dspi->xfer_done) before its initialization happens.
> >
> > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > ---
> 
> Why would an interrupt fire before spi_register_controller, therefore
> before dspi_transfer_one_message could get called?

I don't know this HW, but the generic answer usually is: Bootloader used
SPI and didn't clean up properly.
Vladimir Oltean June 14, 2020, 1:39 p.m. UTC | #4
On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > >
> > > If interrupt fires early, the dspi_interrupt() could complete
> > > (dspi->xfer_done) before its initialization happens.
> > >
> > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> > > Cc: <stable@vger.kernel.org>
> > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > ---
> >
> > Why would an interrupt fire before spi_register_controller, therefore
> > before dspi_transfer_one_message could get called?
> > Is this master or slave mode?
>
> I guess practically it won't fire.  It's more of a matter of logical
> order and:
> 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,

And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
still no issues. dspi_interrupt checks the status bit of the hw, sees
there's nothing to do, and returns IRQ_NONE.
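
The early-return behaviour described above can be modelled in user space.
This is only a sketch, not the driver code: irqreturn_t is a local stand-in
for the kernel type, struct fake_dspi and fake_dspi_interrupt() are
hypothetical names, and the bit positions of SPI_SR_EOQF and SPI_SR_CMDTCF
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's irqreturn_t (assumption, not <linux/irqreturn.h>). */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

#define BIT(n)        (UINT32_C(1) << (n))
#define SPI_SR_EOQF   BIT(28)	/* assumed bit position */
#define SPI_SR_CMDTCF BIT(23)	/* assumed bit position */

/* Hypothetical model of the controller state the handler looks at. */
struct fake_dspi {
	uint32_t sr;			/* simulated status register */
	unsigned int completions;	/* how often the completion was signalled */
};

/* Shared-IRQ style handler: only acts when one of "its" flags is set. */
static irqreturn_t fake_dspi_interrupt(struct fake_dspi *dspi)
{
	assert(dspi);

	/* No transfer event pending: not our interrupt, completion untouched. */
	if (!(dspi->sr & (SPI_SR_EOQF | SPI_SR_CMDTCF)))
		return IRQ_NONE;

	/* Acknowledge the event and signal the waiter. */
	dspi->sr &= ~(SPI_SR_EOQF | SPI_SR_CMDTCF);
	dspi->completions++;
	return IRQ_HANDLED;
}
```

With sr == 0 the handler returns IRQ_NONE and never touches the completion
counter, which is the case a spurious call on a shared line would hit.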

> 2. The hardware is actually initialized before and someone could attach
>    to SPI bus some weird device.
>

Some weird device that does what?

> Best regards,
> Krzysztof
>

Thanks,
-Vladimir
Vladimir Oltean June 14, 2020, 1:43 p.m. UTC | #5
On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >
> > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > >
> > > > If interrupt fires early, the dspi_interrupt() could complete
> > > > (dspi->xfer_done) before its initialization happens.
> > > >
> > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")

Also please note that this patch merely replaced an
init_waitqueue_head with init_completion. But the "bug" (if we can
call it that) originates from even before.

> > > > Cc: <stable@vger.kernel.org>
> > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > > ---
> > >
> > > Why would an interrupt fire before spi_register_controller, therefore
> > > before dspi_transfer_one_message could get called?
> > > Is this master or slave mode?
> >
> > I guess practically it won't fire.  It's more of a matter of logical
> > order and:
> > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
>
> And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
> still no issues. dspi_interrupt checks the status bit of the hw, sees
> there's nothing to do, and returns IRQ_NONE.
>
> > 2. The hardware is actually initialized before and someone could attach
> >    to SPI bus some weird device.
> >
>
> Some weird device that does what?
>
> > Best regards,
> > Krzysztof
> >
>
> Thanks,
> -Vladimir
Krzysztof Kozlowski June 14, 2020, 3:12 p.m. UTC | #6
On Sun, Jun 14, 2020 at 04:43:28PM +0300, Vladimir Oltean wrote:
> On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > >
> > > > > If interrupt fires early, the dspi_interrupt() could complete
> > > > > (dspi->xfer_done) before its initialization happens.
> > > > >
> > > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> 
> Also please note that this patch merely replaced an
> init_waitqueue_head with init_completion. But the "bug" (if we can
> call it that) originates from even before.

Yeah, I know, the Fixes tag is not accurate. A backport to earlier kernels
would be manual, so I am not sure an accurate Fixes tag matters.

> 
> > > > > Cc: <stable@vger.kernel.org>
> > > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > > > ---
> > > >
> > > > Why would an interrupt fire before spi_register_controller, therefore
> > > > before dspi_transfer_one_message could get called?
> > > > Is this master or slave mode?
> > >
> > > I guess practically it won't fire.  It's more of a matter of logical
> > > order and:
> > > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
> >
> > And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
> > still no issues. dspi_interrupt checks the status bit of the hw, sees
> > there's nothing to do, and returns IRQ_NONE.

Indeed, still the logical way of initializing is to do it before any
possible use.

> >
> > > 2. The hardware is actually initialized before and someone could attach
> > >    to SPI bus some weird device.
> > >
> >
> > Some weird device that does what?

You never know what people will connect to a SoM :).

Wolfram actually made a much better point: bootloaders are known to
initialize some things and leave them in whatever state, assuming that
the Linux kernel will redo any initialization properly.

Best regards,
Krzysztof
Vladimir Oltean June 14, 2020, 3:34 p.m. UTC | #7
On Sun, 14 Jun 2020 at 18:12, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Sun, Jun 14, 2020 at 04:43:28PM +0300, Vladimir Oltean wrote:
> > On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@gmail.com> wrote:
> > >
> > > On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > >
> > > > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > > > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > > >
> > > > > > If interrupt fires early, the dspi_interrupt() could complete
> > > > > > (dspi->xfer_done) before its initialization happens.
> > > > > >
> > > > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> >
> > Also please note that this patch merely replaced an
> > init_waitqueue_head with init_completion. But the "bug" (if we can
> > call it that) originates from even before.
>
> Yeah, I know, the Fixes is not accurate. Backport to earlier kernels
> would be manual so I am not sure if accurate Fixes matter.
>
> >
> > > > > > Cc: <stable@vger.kernel.org>
> > > > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > > > > ---
> > > > >
> > > > > Why would an interrupt fire before spi_register_controller, therefore
> > > > > before dspi_transfer_one_message could get called?
> > > > > Is this master or slave mode?
> > > >
> > > > I guess practically it won't fire.  It's more of a matter of logical
> > > > order and:
> > > > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
> > >
> > > And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
> > > still no issues. dspi_interrupt checks the status bit of the hw, sees
> > > there's nothing to do, and returns IRQ_NONE.
>
> Indeed, still the logical way of initializing is to do it before any
> possible use.
>
> > >
> > > > 2. The hardware is actually initialized before and someone could attach
> > > >    to SPI bus some weird device.
> > > >
> > >
> > > Some weird device that does what?
>
> You never know what people will connect to a SoM :).
>
> Wolfram made actually much better point - bootloaders are known to
> initialize some things and leaving them in whatever state, assuming that
> Linux kernel will redo any initialization properly.
>
> Best regards,
> Krzysztof
>

I don't buy the argument.
So ok, maybe some broken bootloader leaves a SPI_SR interrupt pending
(do you have any example of that?). But the driver clears interrupts
by writing SPI_SR_CLEAR in dspi_init (called _before_ requesting the
IRQ). It clears 10 bits from the status register. There are 2 points
to be made here:
- The dspi_interrupt only handles data availability interrupt
(SPI_SR_EOQF | SPI_SR_CMDTCF). Only then does it matter whether the
completion was already initialized or not. But these interrupts _are_
cleared. But assume they weren't. What would Linux even do with a SPI
transfer initiated by the previously running software environment? Why
would it be a smart thing to handle that data in the first place?
- The 10 bits from the status register are all the bits that can be
cleared. The rest of the register, if you look at it, contains the TX
FIFO Counter, the Transmit Next Pointer, the RX FIFO Counter, and the
Pop Next Pointer.
So, unless there's something I'm missing, I don't actually see how
this broken bootloader can do any harm to us.
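
The argument above can be illustrated with a small user-space model. It is a
sketch under stated assumptions: the status flags are treated as
write-one-to-clear, the bit positions are assumed for illustration, and
sr_write_w1c(), sr_has_xfer_event(), fake_dspi_init() and count_set_bits()
are hypothetical helpers, not functions from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)  (UINT32_C(1) << (n))

/* Status flag bits; positions are assumptions, check the reference manual. */
#define SPI_SR_TCFQF   BIT(31)
#define SPI_SR_EOQF    BIT(28)
#define SPI_SR_TFUF    BIT(27)
#define SPI_SR_TFFF    BIT(25)
#define SPI_SR_CMDTCF  BIT(23)
#define SPI_SR_SPEF    BIT(21)
#define SPI_SR_RFOF    BIT(19)
#define SPI_SR_TFIWF   BIT(18)
#define SPI_SR_RFDF    BIT(17)
#define SPI_SR_CMDFFF  BIT(16)
#define SPI_SR_CLEAR   (SPI_SR_TCFQF | SPI_SR_EOQF | SPI_SR_TFUF | \
			SPI_SR_TFFF | SPI_SR_CMDTCF | SPI_SR_SPEF | \
			SPI_SR_RFOF | SPI_SR_TFIWF | SPI_SR_RFDF | \
			SPI_SR_CMDFFF)

/* Number of set bits, used to check that the mask covers 10 flags. */
static unsigned int count_set_bits(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Write-one-to-clear: only flag bits react, counters/pointers are kept. */
static uint32_t sr_write_w1c(uint32_t sr, uint32_t val)
{
	return sr & ~(val & SPI_SR_CLEAR);
}

/* Would the interrupt handler treat this status as a finished transfer? */
static int sr_has_xfer_event(uint32_t sr)
{
	return (sr & (SPI_SR_EOQF | SPI_SR_CMDTCF)) != 0;
}

/* Models the clearing step that runs before the IRQ is requested. */
static void fake_dspi_init(uint32_t *sr)
{
	assert(sr);
	*sr = sr_write_w1c(*sr, SPI_SR_CLEAR);
}
```

In this model, a status value left behind by earlier software with EOQF or
CMDTCF set no longer reports a transfer event once fake_dspi_init() has run,
while the low counter and pointer bits are left as they were.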

Thanks,
-Vladimir
Krzysztof Kozlowski June 15, 2020, 7:08 a.m. UTC | #8
On Sun, Jun 14, 2020 at 06:34:33PM +0300, Vladimir Oltean wrote:
> On Sun, 14 Jun 2020 at 18:12, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >
> > On Sun, Jun 14, 2020 at 04:43:28PM +0300, Vladimir Oltean wrote:
> > > On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@gmail.com> wrote:
> > > >
> > > > On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > >
> > > > > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > > > > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > > > >
> > > > > > > If interrupt fires early, the dspi_interrupt() could complete
> > > > > > > (dspi->xfer_done) before its initialization happens.
> > > > > > >
> > > > > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> > >
> > > Also please note that this patch merely replaced an
> > > init_waitqueue_head with init_completion. But the "bug" (if we can
> > > call it that) originates from even before.
> >
> > Yeah, I know, the Fixes is not accurate. Backport to earlier kernels
> > would be manual so I am not sure if accurate Fixes matter.
> >
> > >
> > > > > > > Cc: <stable@vger.kernel.org>
> > > > > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > > > > > ---
> > > > > >
> > > > > > Why would an interrupt fire before spi_register_controller, therefore
> > > > > > before dspi_transfer_one_message could get called?
> > > > > > Is this master or slave mode?
> > > > >
> > > > > I guess practically it won't fire.  It's more of a matter of logical
> > > > > order and:
> > > > > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
> > > >
> > > > And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
> > > > still no issues. dspi_interrupt checks the status bit of the hw, sees
> > > > there's nothing to do, and returns IRQ_NONE.
> >
> > Indeed, still the logical way of initializing is to do it before any
> > possible use.
> >
> > > >
> > > > > 2. The hardware is actually initialized before and someone could attach
> > > > >    to SPI bus some weird device.
> > > > >
> > > >
> > > > Some weird device that does what?
> >
> > You never know what people will connect to a SoM :).
> >
> > Wolfram made actually much better point - bootloaders are known to
> > initialize some things and leaving them in whatever state, assuming that
> > Linux kernel will redo any initialization properly.
> >
> > Best regards,
> > Krzysztof
> >
> 
> I don't buy the argument.
> So ok, maybe some broken bootloader leaves a SPI_SR interrupt pending
> (do you have any example of that?). But the driver clears interrupts
> by writing SPI_SR_CLEAR in dspi_init (called _before_ requesting the
> IRQ). It clears 10 bits from the status register. There are 2 points
> to be made here:
> - The dspi_interrupt only handles data availability interrupt
> (SPI_SR_EOQF | SPI_SR_CMDTCF). Only then does it matter whether the
> completion was already initialized or not. But these interrupts _are_
> cleared. But assume they weren't. What would Linux even do with a SPI
> transfer initiated by the previously running software environment? Why
> would it be a smart thing to handle that data in the first place?
> - The 10 bits from the status register are all the bits that can be
> cleared. The rest of the register, if you look at it, contains the TX
> FIFO Counter, the Transmit Next Pointer, the RX FIFO Counter, and the
> Pop Next Pointer.
> So, unless there's something I'm missing, I don't actually see how
> this broken bootloader can do any harm to us.

Let's rephrase it: you think therefore that completion should be
initialized *after* requesting shared interrupts? You think that exactly
that order shall be used in the source code?

Best regards,
Krzysztof
Vladimir Oltean June 15, 2020, 9:26 a.m. UTC | #9
On Mon, 15 Jun 2020 at 10:09, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Sun, Jun 14, 2020 at 06:34:33PM +0300, Vladimir Oltean wrote:
> > On Sun, 14 Jun 2020 at 18:12, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > >
> > > On Sun, Jun 14, 2020 at 04:43:28PM +0300, Vladimir Oltean wrote:
> > > > On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@gmail.com> wrote:
> > > > >
> > > > > On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > > >
> > > > > > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote:
> > > > > > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > > > > >
> > > > > > > > If interrupt fires early, the dspi_interrupt() could complete
> > > > > > > > (dspi->xfer_done) before its initialization happens.
> > > > > > > >
> > > > > > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> > > >
> > > > Also please note that this patch merely replaced an
> > > > init_waitqueue_head with init_completion. But the "bug" (if we can
> > > > call it that) originates from even before.
> > >
> > > Yeah, I know, the Fixes is not accurate. Backport to earlier kernels
> > > would be manual so I am not sure if accurate Fixes matter.
> > >
> > > >
> > > > > > > > Cc: <stable@vger.kernel.org>
> > > > > > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > > > > > > ---
> > > > > > >
> > > > > > > Why would an interrupt fire before spi_register_controller, therefore
> > > > > > > before dspi_transfer_one_message could get called?
> > > > > > > Is this master or slave mode?
> > > > > >
> > > > > > I guess practically it won't fire.  It's more of a matter of logical
> > > > > > order and:
> > > > > > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day,
> > > > >
> > > > > And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and
> > > > > still no issues. dspi_interrupt checks the status bit of the hw, sees
> > > > > there's nothing to do, and returns IRQ_NONE.
> > >
> > > Indeed, still the logical way of initializing is to do it before any
> > > possible use.
> > >
> > > > >
> > > > > > 2. The hardware is actually initialized before and someone could attach
> > > > > >    to SPI bus some weird device.
> > > > > >
> > > > >
> > > > > Some weird device that does what?
> > >
> > > You never know what people will connect to a SoM :).
> > >
> > > Wolfram made actually much better point - bootloaders are known to
> > > initialize some things and leaving them in whatever state, assuming that
> > > Linux kernel will redo any initialization properly.
> > >
> > > Best regards,
> > > Krzysztof
> > >
> >
> > I don't buy the argument.
> > So ok, maybe some broken bootloader leaves a SPI_SR interrupt pending
> > (do you have any example of that?). But the driver clears interrupts
> > by writing SPI_SR_CLEAR in dspi_init (called _before_ requesting the
> > IRQ). It clears 10 bits from the status register. There are 2 points
> > to be made here:
> > - The dspi_interrupt only handles data availability interrupt
> > (SPI_SR_EOQF | SPI_SR_CMDTCF). Only then does it matter whether the
> > completion was already initialized or not. But these interrupts _are_
> > cleared. But assume they weren't. What would Linux even do with a SPI
> > transfer initiated by the previously running software environment? Why
> > would it be a smart thing to handle that data in the first place?
> > - The 10 bits from the status register are all the bits that can be
> > cleared. The rest of the register, if you look at it, contains the TX
> > FIFO Counter, the Transmit Next Pointer, the RX FIFO Counter, and the
> > Pop Next Pointer.
> > So, unless there's something I'm missing, I don't actually see how
> > this broken bootloader can do any harm to us.
>
> Let's rephrase it: you think therefore that completion should be
> initialized *after* requesting shared interrupts? You think that exactly
> that order shall be used in the source code?
>
> Best regards,
> Krzysztof
>
>

I think that completion should be initialized before it is used, just
like any other variable. So far you have not proven any code path
through which it can be used uninitialized, therefore I don't see why
this should be accepted as a bug fix. Cleanup, cosmetic refactoring,
design patterns, whatever, sure.

Thanks,
-Vladimir
Krzysztof Kozlowski June 15, 2020, 9:30 a.m. UTC | #10
On Mon, Jun 15, 2020 at 12:26:37PM +0300, Vladimir Oltean wrote:
> > Let's rephrase it: you think therefore that completion should be
> > initialized *after* requesting shared interrupts? You think that exactly
> > that order shall be used in the source code?
> >
> > Best regards,
> > Krzysztof
> >
> >
> 
> I think that completion should be initialized before it is used, just
> like any other variable. So far you have not proven any code path
> through which it can be used uninitialized, therefore I don't see why
> this should be accepted as a bug fix. Cleanup, cosmetic refactoring,
> design patterns, whatever, sure.

Sure, let's call it a cleanup or cosmetic refactoring then.

Best regards,
Krzysztof

Patch

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 57e7a626ba00..efb63ed9fd86 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -1385,6 +1385,8 @@  static int dspi_probe(struct platform_device *pdev)
 		goto poll_mode;
 	}
 
+	init_completion(&dspi->xfer_done);
+
 	ret = request_threaded_irq(dspi->irq, dspi_interrupt, NULL,
 				   IRQF_SHARED, pdev->name, dspi);
 	if (ret < 0) {
@@ -1392,8 +1394,6 @@  static int dspi_probe(struct platform_device *pdev)
 		goto out_clk_put;
 	}
 
-	init_completion(&dspi->xfer_done);
-
 poll_mode:
 
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
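
The effect of the reordering can be sketched with a toy user-space model.
Everything here is hypothetical (struct toy_completion, struct toy_dspi and
the toy_* functions are not kernel APIs), and the pending event at probe time
is an assumption made only to show the ordering; as discussed in the thread,
no real code path producing such an event was demonstrated.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy completion: 'initialized' models init_completion() having run. */
struct toy_completion {
	bool initialized;
	unsigned int done;
};

struct toy_dspi {
	struct toy_completion xfer_done;
	bool irq_pending;		/* hypothetical stale event at probe time */
	bool used_uninitialized;	/* set if the handler ran before init */
};

static void toy_init_completion(struct toy_completion *c)
{
	c->initialized = true;
	c->done = 0;
}

/* Toy handler: signals the completion only if an event is pending. */
static void toy_interrupt(struct toy_dspi *dspi)
{
	if (!dspi->irq_pending)
		return;
	dspi->irq_pending = false;
	if (!dspi->xfer_done.initialized)
		dspi->used_uninitialized = true;
	dspi->xfer_done.done++;
}

/* Registering the handler may run it at once if an event is pending. */
static void toy_request_irq(struct toy_dspi *dspi)
{
	toy_interrupt(dspi);
}

/* Order before the patch: request the IRQ, then initialize. */
static void toy_probe_old(struct toy_dspi *dspi)
{
	assert(dspi);
	toy_request_irq(dspi);
	toy_init_completion(&dspi->xfer_done);
}

/* Order after the patch: initialize, then request the IRQ. */
static void toy_probe_new(struct toy_dspi *dspi)
{
	assert(dspi);
	toy_init_completion(&dspi->xfer_done);
	toy_request_irq(dspi);
}
```

With irq_pending set, toy_probe_old() flags a use before initialization and
toy_probe_new() does not; with no pending event both orders behave the same,
which matches the view that the change is a cleanup rather than a bug fix.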