[v2,05/10] spi: bcm2835: Drop dma_pending flag

Message ID	062b03b7f86af77a13ce0ec3b22e0bdbfcfba10d.1568187525.git.lukas@wunner.de (mailing list archive)
State	Accepted
Headers	show Return-Path: <SRS0=u/Zo=XG=vger.kernel.org=linux-spi-owner@kernel.org> Message-Id: <062b03b7f86af77a13ce0ec3b22e0bdbfcfba10d.1568187525.git.lukas@wunner.de> In-Reply-To: <cover.1568187525.git.lukas@wunner.de> References: <cover.1568187525.git.lukas@wunner.de> From: Lukas Wunner <lukas@wunner.de> Date: Wed, 11 Sep 2019 12:15:30 +0200 Subject: [PATCH v2 05/10] spi: bcm2835: Drop dma_pending flag MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit To: Mark Brown <broonie@kernel.org> Cc: linux-spi@vger.kernel.org, linux-rpi-kernel@lists.infradead.org Sender: linux-spi-owner@vger.kernel.org Precedence: bulk
Series	Speed up SPI simplex transfers on Raspberry Pi \| expand [v2,00/10] Speed up SPI simplex transfers on Raspberry Pi [v2,01/10] dmaengine: bcm2835: Allow reusable descriptors [v2,02/10] dmaengine: bcm2835: Allow cyclic transactions without interrupt [v2,03/10] spi: Guarantee cacheline alignment of driver-private data [v2,04/10] spi: bcm2835: Work around DONE bit erratum [v2,05/10] spi: bcm2835: Drop dma_pending flag [v2,06/10] spi: bcm2835: Cache CS register value for ->prepare_message() [v2,07/10] spi: bcm2835: Speed up TX-only DMA transfers by clearing RX FIFO [v2,08/10] dmaengine: bcm2835: Document struct bcm2835_dmadev [v2,09/10] dmaengine: bcm2835: Avoid accessing memory when copying zeroes [v2,10/10] spi: bcm2835: Speed up RX-only DMA transfers by zero-filling TX FIFO

Message ID

062b03b7f86af77a13ce0ec3b22e0bdbfcfba10d.1568187525.git.lukas@wunner.de (mailing list archive)

State

Accepted

Headers

Message-Id: 
 <062b03b7f86af77a13ce0ec3b22e0bdbfcfba10d.1568187525.git.lukas@wunner.de>
In-Reply-To: <cover.1568187525.git.lukas@wunner.de>
References: <cover.1568187525.git.lukas@wunner.de>
From: Lukas Wunner <lukas@wunner.de>
Date: Wed, 11 Sep 2019 12:15:30 +0200
Subject: [PATCH v2 05/10] spi: bcm2835: Drop dma_pending flag
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
To: Mark Brown <broonie@kernel.org>
Cc: linux-spi@vger.kernel.org, linux-rpi-kernel@lists.infradead.org
Sender: linux-spi-owner@vger.kernel.org
Precedence: bulk

Series

Speed up SPI simplex transfers on Raspberry Pi | expand

Commit Message

Lukas Wunner Sept. 11, 2019, 10:15 a.m. UTC

The BCM2835 SPI driver uses a flag to keep track of whether a DMA
transfer is in progress.

The flag is used to avoid terminating DMA channels multiple times if a
transfer finishes orderly while simultaneously the SPI core invokes the
->handle_err() callback because the transfer took too long.  However
terminating DMA channels multiple times is perfectly fine, so the flag
is unnecessary for this particular purpose.

The flag is also used to avoid invoking bcm2835_spi_undo_prologue()
multiple times under this race condition.  However multiple *concurrent*
invocations can no longer happen since commit 2527704d8411 ("spi:
bcm2835: Synchronize with callback on DMA termination") because the
->handle_err() callback now uses the _sync() variant when terminating
DMA channels.

The only raison d'être of the flag is therefore that
bcm2835_spi_undo_prologue() cannot cope with multiple *sequential*
invocations.  Achieve that by setting tx_prologue to 0 at the end of
the function.  Subsequent invocations thus become no-ops.

With that, the dma_pending flag becomes unnecessary, so drop it.

Tested-by: Nuno Sá <nuno.sa@analog.com>
Tested-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Stefan Wahren <wahrenst@gmx.net>
Acked-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/spi/spi-bcm2835.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

Comments

Marc Kleine-Budde July 19, 2022, 6:52 a.m. UTC | #1

On 11.09.2019 12:15:30, Lukas Wunner wrote:
> The BCM2835 SPI driver uses a flag to keep track of whether a DMA
> transfer is in progress.
>
> The flag is used to avoid terminating DMA channels multiple times if a
> transfer finishes orderly while simultaneously the SPI core invokes the
> ->handle_err() callback because the transfer took too long.  However
> terminating DMA channels multiple times is perfectly fine, so the flag
> is unnecessary for this particular purpose.
>
> The flag is also used to avoid invoking bcm2835_spi_undo_prologue()
> multiple times under this race condition.  However multiple *concurrent*
> invocations can no longer happen since commit 2527704d8411 ("spi:
> bcm2835: Synchronize with callback on DMA termination") because the
> ->handle_err() callback now uses the _sync() variant when terminating
> DMA channels.
>
> The only raison d'être of the flag is therefore that
> bcm2835_spi_undo_prologue() cannot cope with multiple *sequential*
> invocations.  Achieve that by setting tx_prologue to 0 at the end of
> the function.  Subsequent invocations thus become no-ops.
>
> With that, the dma_pending flag becomes unnecessary, so drop it.
>
> Tested-by: Nuno Sá <nuno.sa@analog.com>
> Tested-by: Noralf Trønnes <noralf@tronnes.org>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Acked-by: Stefan Wahren <wahrenst@gmx.net>
> Acked-by: Martin Sperl <kernel@martin.sperl.org>

I think this patch breaks the bcm2835_spi_handle_err() function, which
may be called for non-DMA transfers, too. In my test case a non DMA, non
polling transfer runs into a timeout and produces this backtrace:

| [ 1651.800430] spidev spi3.0: SPI transfer timed out   
| [ 1651.800468] Internal error: Oops: 206 [#1] PREEMPT_RT SMP ARM
| [ 1651.800473] Modules linked in: can_raw can brcmfmac mcp251xfd can_dev spidev brcmutil bcm2835_codec(C) bcm2835_v4l2(C) v4l2_mem2mem bcm2835_isp(C) videobuf2_vmalloc bcm2835_mmal_vchiq(C) cfg80211 videobuf2_dma_contig dwc2 videobuf2_memops videobuf2_v4l2 videobuf2_common
| videodev raspberrypi_hwmon vc_sm_cma(C) rfkill roles spi_bcm2835aux spi_bcm2835 mc rpivid_mem uio_pdrv_genirq uio drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
| [ 1651.800533] CPU: 1 PID: 766 Comm: SpiCanTest Tainted: G         C        5.15.40-rt43-v7l+ #2
| [ 1651.800538] Hardware name: BCM2711
| [ 1651.800540] PC is at bcm2835_spi_handle_err+0x20/0xe0 [spi_bcm2835]
| [ 1651.800555] LR is at spi_transfer_one_message+0x54c/0x68c
| [ 1651.800566] pc : [<bf17ce9c>]    lr : [<c0a26000>]    psr: a0030113
| [ 1651.800569] sp : c2947d50  ip : c2947d70  fp : c2947d6c
| [ 1651.800572] r10: c21ce2f8  r9 : ffffff92  r8 : c1558340
| [ 1651.800574] r7 : c2947eac  r6 : 00000000  r5 : c21ce000  r4 : c21ce3c0
| [ 1651.800577] r3 : bf17ce7c  r2 : 00000000  r1 : c2947eac  r0 : c21ce000
| [ 1651.800580] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
| [ 1651.800584] Control: 30c5383d  Table: 03049980  DAC: 55555555
| [ 1651.800587] Register r0 information: slab kmalloc-2k start c21ce000 pointer offset 0 size 2048
| [ 1651.800599] Register r1 information: non-slab/vmalloc memory
| [ 1651.800604] Register r2 information: NULL pointer
| [ 1651.800608] Register r3 information: 5-page vmalloc region starting at 0xbf17c000 allocated at load_module+0xc98/0x2bc8
| [ 1651.800618] Register r4 information: slab kmalloc-2k start c21ce000 pointer offset 960 size 2048
| [ 1651.800627] Register r5 information: slab kmalloc-2k start c21ce000 pointer offset 0 size 2048
| [ 1651.800636] Register r6 information: NULL pointer
| [ 1651.800639] Register r7 information: non-slab/vmalloc memory
| [ 1651.800643] Register r8 information: non-slab/vmalloc memory
| [ 1651.800646] Register r9 information: non-paged memory
| [ 1651.800650] Register r10 information: slab kmalloc-2k start c21ce000 pointer offset 760 size 2048
| [ 1651.800658] Register r11 information: non-slab/vmalloc memory
| [ 1651.800662] Register r12 information: non-slab/vmalloc memory
| [ 1651.800665] Process SpiCanTest (pid: 766, stack limit = 0x071eb77b)
| [ 1651.800669] Stack: (0xc2947d50 to 0xc2948000)
| [ 1651.800673] 7d40:                                     c3dada40 c4d92380 c21ce000 c2947eac
| [ 1651.800678] 7d60: c2947dcc c2947d70 c0a26000 bf17ce88 c2947db4 00000000 c3dad800 c3dad800
| [ 1651.800682] 7d80: 0147ae0e c10f7fc4 c140756f c21ce2f8 ffffe000 c140689c 00000000 c3dada40
| [ 1651.800686] 7da0: c21ce210 c21ce000 00000000 bf17d050 c2947ed0 c3dada40 c21ce2f8 c2947eac
| [ 1651.800689] 7dc0: c2947e0c c2947dd0 c0a281a0 c0a25ac0 c025643c d55cd1f7 c21ce210 c3dad800
| [ 1651.800693] 7de0: c2947eac c21ce000 c3dad800 c2947eac c21ce000 c3dada40 c21ce2f8 c0a23fc0
| [ 1651.800697] 7e00: c2947e5c c2947e10 c0a28920 c0a27dac c21ce288 c21ce230 00000000 00000000
| [ 1651.800700] 7e20: c2947e20 c2947e20 c0d398d4 d55cd1f7 c2947e5c c3dad800 c2947eac c3dad800
| [ 1651.800704] 7e40: c44987c0 c44987c4 c44987e0 000000ff c2947e74 c2947e60 c0a28a1c c0a28700
| [ 1651.800707] 7e60: c3dad800 c36459e0 c2947f14 c2947e78 bf223a5c c0a289f4 c2947ee8 c44987c4
| [ 1651.800711] 7e80: 00000001 c4d92380 c44987e0 c3dad800 c36459c0 c44987c0 bf226000 000000ff
| [ 1651.800714] 7ea0: c36459e0 c4412100 ffffe000 c4d923e0 c4d923e0 c3dad800 00000000 c0a23388
| [ 1651.800718] 7ec0: c2947e18 000000ff 00000000 ffffff92 c2947ed0 c2947ed0 00000000 c2947edc
| [ 1651.800721] 7ee0: c2947edc d55cd1f7 b6bd4cb0 40206b00 00000000 c50b2001 b6bd4cb0 c50b2000
| [ 1651.800725] 7f00: 00000003 c470a780 c2947fa4 c2947f18 c04726f4 bf223408 b6f3c330 00000193
| [ 1651.800728] 7f20: c0200244 b6bd4cf0 00000010 00000000 00000673 c08601cc c2947f74 c2947f48
| [ 1651.800732] 7f40: c02b6cf0 c08601b8 00000673 00000000 2228e32f 00000000 c2947f74 d55cd1f7
| [ 1651.800735] 7f60: b6bd4cf0 00000001 c2947fa4 c2947f78 c02c92b8 d55cd1f7 00000673 00000003
| [ 1651.800739] 7f80: 005020d0 00502000 00000036 c0200244 c2946000 00000036 00000000 c2947fa8
| [ 1651.800742] 7fa0: c0200040 c04725e0 00000003 005020d0 00000003 40206b00 b6bd4cb0 00000000
| [ 1651.800746] 7fc0: 00000003 005020d0 00502000 00000036 2202be12 b6f35010 00000000 b6bd4e90
| [ 1651.800749] 7fe0: 00502044 b6bd4c74 004f17eb b6cd6da8 400e0030 00000003 00000000 00000000
| [ 1651.800751] Backtrace:
| [ 1651.800754] [<bf17ce7c>] (bcm2835_spi_handle_err [spi_bcm2835]) from [<c0a26000>] (spi_transfer_one_message+0x54c/0x68c)
| [ 1651.800768]  r7:c2947eac r6:c21ce000 r5:c4d92380 r4:c3dada40
| [ 1651.800770] [<c0a25ab4>] (spi_transfer_one_message) from [<c0a281a0>] (__spi_pump_messages+0x400/0x930)
| [ 1651.800781]  r10:c2947eac r9:c21ce2f8 r8:c3dada40 r7:c2947ed0 r6:bf17d050 r5:00000000
| [ 1651.800783]  r4:c21ce000
| [ 1651.800785] [<c0a27da0>] (__spi_pump_messages) from [<c0a28920>] (__spi_sync+0x22c/0x2f4)
| [ 1651.800795]  r10:c0a23fc0 r9:c21ce2f8 r8:c3dada40 r7:c21ce000 r6:c2947eac r5:c3dad800
| [ 1651.800797]  r4:c21ce000
| [ 1651.800798] [<c0a286f4>] (__spi_sync) from [<c0a28a1c>] (spi_sync+0x34/0x4c)
| [ 1651.800808]  r10:000000ff r9:c44987e0 r8:c44987c4 r7:c44987c0 r6:c3dad800 r5:c2947eac
| [ 1651.800810]  r4:c3dad800

> ---
>  drivers/spi/spi-bcm2835.c | 23 ++++++++---------------
>  1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
> index f79f04ea42e5..532c58bcfd45 100644
> --- a/drivers/spi/spi-bcm2835.c
> +++ b/drivers/spi/spi-bcm2835.c
[...]
> @@ -927,11 +921,10 @@ static void bcm2835_spi_handle_err(struct spi_controller *ctlr,
>  	struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
>
>  	/* if an error occurred and we have an active dma, then terminate */
> -	if (cmpxchg(&bs->dma_pending, true, false)) {
> -		dmaengine_terminate_sync(ctlr->dma_tx);
> -		dmaengine_terminate_sync(ctlr->dma_rx);
> -		bcm2835_spi_undo_prologue(bs);
> -	}
> +	dmaengine_terminate_sync(ctlr->dma_tx);
> +	dmaengine_terminate_sync(ctlr->dma_rx);

... because the ctrl->dma_tx and ->dma_rx are NULL.

> +	bcm2835_spi_undo_prologue(bs);
> +
>  	/* and reset */
>  	bcm2835_spi_reset_hw(ctlr);
>  }

The question is: Why runs the IRQ based transfer into a timeout? The
kernel that produces the crash has ecfbd3cf3b8b ("spi: bcm2835: Enable
shared interrupt support") applied (which was reverted on mainline in a
later patch).

I'll create a patch to fix the NULL pointer deref. As a interrupt based
transfer might run into a timeout for other reasons, too. So better
avoid a kernel crash in that case.

regards,
Marc

--
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

Stefan Wahren July 19, 2022, 7:34 a.m. UTC | #2

Hi Marc,

Am 19.07.22 um 08:52 schrieb Marc Kleine-Budde:
> The question is: Why runs the IRQ based transfer into a timeout? The
> kernel that produces the crash has ecfbd3cf3b8b ("spi: bcm2835: Enable
> shared interrupt support") applied (which was reverted on mainline in a
> later patch).

thank for your report. Unfortunately i don't have an answer to your 
question, but maybe this is related:

https://github.com/raspberrypi/linux/commit/c643a3603dcbe6d0feada33915cef1ef896b865e

> I'll create a patch to fix the NULL pointer deref. As a interrupt based
> transfer might run into a timeout for other reasons, too. So better
> avoid a kernel crash in that case.
Yes, please
>
> regards,
> Marc
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
>
> _______________________________________________
> linux-rpi-kernel mailing list
> linux-rpi-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rpi-kernel

Marc Kleine-Budde July 19, 2022, 7:45 a.m. UTC | #3

On 19.07.2022 09:34:45, Stefan Wahren wrote:
> Am 19.07.22 um 08:52 schrieb Marc Kleine-Budde:
> > The question is: Why runs the IRQ based transfer into a timeout? The
> > kernel that produces the crash has ecfbd3cf3b8b ("spi: bcm2835: Enable
> > shared interrupt support") applied (which was reverted on mainline in a
> > later patch).
> 
> thank for your report. Unfortunately i don't have an answer to your
> question, but maybe this is related:

Sorry, that question was rhetorical. The kernel that produced this crash
was too old, it has support for the shared SPI interrupts ecfbd3cf3b8b
("spi: bcm2835: Enable shared interrupt support"), but misses the fix
commit 46feb7d7241b ("spi: bcm2835: Fix for shared interrupts").

With the 46feb7d7241b ("spi: bcm2835: Fix for shared interrupts") SPI
transmit timeout doesn't happen any more and thus the NULL pointer
deref is not triggered.

> https://github.com/raspberrypi/linux/commit/c643a3603dcbe6d0feada33915cef1ef896b865e
> 
> > I'll create a patch to fix the NULL pointer deref. As a interrupt based
> > transfer might run into a timeout for other reasons, too. So better
> > avoid a kernel crash in that case.
> 
> Yes, please

Done:
https://lore.kernel.org/all/20220719072234.2782764-1-mkl@pengutronix.de

regards,
Marc

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index f79f04ea42e5..532c58bcfd45 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -94,7 +94,6 @@  MODULE_PARM_DESC(polling_limit_us,
  * @rx_prologue: bytes received without DMA if first RX sglist entry's
  *	length is not a multiple of 4 (to overcome hardware limitation)
  * @tx_spillover: whether @tx_prologue spills over to second TX sglist entry
- * @dma_pending: whether a DMA transfer is in progress
  * @debugfs_dir: the debugfs directory - neede to remove debugfs when
  *      unloading the module
  * @count_transfer_polling: count of how often polling mode is used
@@ -117,7 +116,6 @@  struct bcm2835_spi {
 	int tx_prologue;
 	int rx_prologue;
 	unsigned int tx_spillover;
-	unsigned int dma_pending;
 
 	struct dentry *debugfs_dir;
 	u64 count_transfer_polling;
@@ -551,6 +549,8 @@  static void bcm2835_spi_undo_prologue(struct bcm2835_spi *bs)
 		sg_dma_address(&tfr->tx_sg.sgl[1]) -= 4;
 		sg_dma_len(&tfr->tx_sg.sgl[1])     += 4;
 	}
+
+	bs->tx_prologue = 0;
 }
 
 static void bcm2835_spi_dma_done(void *data)
@@ -566,10 +566,8 @@  static void bcm2835_spi_dma_done(void *data)
 	 * is called the tx-dma must have finished - can't get to this
 	 * situation otherwise...
 	 */
-	if (cmpxchg(&bs->dma_pending, true, false)) {
-		dmaengine_terminate_async(ctlr->dma_tx);
-		bcm2835_spi_undo_prologue(bs);
-	}
+	dmaengine_terminate_async(ctlr->dma_tx);
+	bcm2835_spi_undo_prologue(bs);
 
 	/* and mark as completed */;
 	complete(&ctlr->xfer_completion);
@@ -644,9 +642,6 @@  static int bcm2835_spi_transfer_one_dma(struct spi_controller *ctlr,
 	/* start TX early */
 	dma_async_issue_pending(ctlr->dma_tx);
 
-	/* mark as dma pending */
-	bs->dma_pending = 1;
-
 	/* set the DMA length */
 	bcm2835_wr(bs, BCM2835_SPI_DLEN, bs->tx_len);
 
@@ -662,7 +657,6 @@  static int bcm2835_spi_transfer_one_dma(struct spi_controller *ctlr,
 	if (ret) {
 		/* need to reset on errors */
 		dmaengine_terminate_sync(ctlr->dma_tx);
-		bs->dma_pending = false;
 		goto err_reset_hw;
 	}
 
@@ -927,11 +921,10 @@  static void bcm2835_spi_handle_err(struct spi_controller *ctlr,
 	struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
 
 	/* if an error occurred and we have an active dma, then terminate */
-	if (cmpxchg(&bs->dma_pending, true, false)) {
-		dmaengine_terminate_sync(ctlr->dma_tx);
-		dmaengine_terminate_sync(ctlr->dma_rx);
-		bcm2835_spi_undo_prologue(bs);
-	}
+	dmaengine_terminate_sync(ctlr->dma_tx);
+	dmaengine_terminate_sync(ctlr->dma_rx);
+	bcm2835_spi_undo_prologue(bs);
+
 	/* and reset */
 	bcm2835_spi_reset_hw(ctlr);
 }

[v2,05/10] spi: bcm2835: Drop dma_pending flag

Commit Message

Comments

Patch