diff mbox series

[3/7] spi: bcm2835: Fix race on DMA termination

Message ID 4100e8ad7f8673db3f586439c831cfa4ec909cf4.1541659680.git.lukas@wunner.de (mailing list archive)
State Accepted
Commit e82b0b3828451c1cd331d9f304c6078fcd43b62e
Headers show
Series Raspberry Pi spi0 improvements | expand

Commit Message

Lukas Wunner Nov. 8, 2018, 7:06 a.m. UTC
If a DMA transfer finishes orderly right when spi_transfer_one_message()
determines that it has timed out, the callbacks bcm2835_spi_dma_done()
and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
potentially leading to double termination.

Prevent by atomically changing the dma_pending flag before calling
dmaengine_terminate_all().

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
Cc: stable@vger.kernel.org # v4.2+
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
---
 drivers/spi/spi-bcm2835.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Stefan Wahren Nov. 9, 2018, 3:28 p.m. UTC | #1
Hi Lukas,

[add Vinod]

Am 08.11.18 um 08:06 schrieb Lukas Wunner:
> If a DMA transfer finishes orderly right when spi_transfer_one_message()
> determines that it has timed out, the callbacks bcm2835_spi_dma_done()
> and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
> potentially leading to double termination.
>
> Prevent by atomically changing the dma_pending flag before calling
> dmaengine_terminate_all().
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
> Cc: stable@vger.kernel.org # v4.2+
> Cc: Mathias Duckeck <m.duckeck@kunbus.de>
> Cc: Frank Pavlic <f.pavlic@kunbus.de>
> Cc: Martin Sperl <kernel@martin.sperl.org>
> Cc: Noralf Trønnes <noralf@tronnes.org>
> ---
>  drivers/spi/spi-bcm2835.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
> index 594e9712ecbc..774161bbcb2e 100644
> --- a/drivers/spi/spi-bcm2835.c
> +++ b/drivers/spi/spi-bcm2835.c
> @@ -232,10 +232,9 @@ static void bcm2835_spi_dma_done(void *data)
>  	 * is called the tx-dma must have finished - can't get to this
>  	 * situation otherwise...
>  	 */
> -	dmaengine_terminate_all(master->dma_tx);
> -
> -	/* mark as no longer pending */
> -	bs->dma_pending = 0;
> +	if (cmpxchg(&bs->dma_pending, true, false)) {
> +		dmaengine_terminate_all(master->dma_tx);
> +	}
>  
>  	/* and mark as completed */;
>  	complete(&master->xfer_completion);
> @@ -617,10 +616,9 @@ static void bcm2835_spi_handle_err(struct spi_master *master,
>  	struct bcm2835_spi *bs = spi_master_get_devdata(master);
>  
>  	/* if an error occurred and we have an active dma, then terminate */
> -	if (bs->dma_pending) {
> +	if (cmpxchg(&bs->dma_pending, true, false)) {
>  		dmaengine_terminate_all(master->dma_tx);
>  		dmaengine_terminate_all(master->dma_rx);
> -		bs->dma_pending = 0;
>  	}
>  	/* and reset */
>  	bcm2835_spi_reset_hw(master);

i'm not sure but this doesn't look like the right way (tm) to me.

Btw according to dmaengine.h the usage of function
dmaengine_terminate_all is deprecated. Would be nice to get this fixed, too.

Sorry, for not providing a more helpful advice.

Regards
Stefan
Lukas Wunner Nov. 10, 2018, 2:29 p.m. UTC | #2
On Fri, Nov 09, 2018 at 04:28:48PM +0100, Stefan Wahren wrote:
> Am 08.11.18 um 08:06 schrieb Lukas Wunner:
> > If a DMA transfer finishes orderly right when spi_transfer_one_message()
> > determines that it has timed out, the callbacks bcm2835_spi_dma_done()
> > and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
> > potentially leading to double termination.
> >
> > Prevent by atomically changing the dma_pending flag before calling
> > dmaengine_terminate_all().
> >
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
> > Cc: stable@vger.kernel.org # v4.2+
> > ---
> >  drivers/spi/spi-bcm2835.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
> > index 594e9712ecbc..774161bbcb2e 100644
> > --- a/drivers/spi/spi-bcm2835.c
> > +++ b/drivers/spi/spi-bcm2835.c
> > @@ -232,10 +232,9 @@ static void bcm2835_spi_dma_done(void *data)
> >  	 * is called the tx-dma must have finished - can't get to this
> >  	 * situation otherwise...
> >  	 */
> > -	dmaengine_terminate_all(master->dma_tx);
> > -
> > -	/* mark as no longer pending */
> > -	bs->dma_pending = 0;
> > +	if (cmpxchg(&bs->dma_pending, true, false)) {
> > +		dmaengine_terminate_all(master->dma_tx);
> > +	}
> >  
> >  	/* and mark as completed */;
> >  	complete(&master->xfer_completion);
> > @@ -617,10 +616,9 @@ static void bcm2835_spi_handle_err(struct spi_master *master,
> >  	struct bcm2835_spi *bs = spi_master_get_devdata(master);
> >  
> >  	/* if an error occurred and we have an active dma, then terminate */
> > -	if (bs->dma_pending) {
> > +	if (cmpxchg(&bs->dma_pending, true, false)) {
> >  		dmaengine_terminate_all(master->dma_tx);
> >  		dmaengine_terminate_all(master->dma_rx);
> > -		bs->dma_pending = 0;
> >  	}
> >  	/* and reset */
> >  	bcm2835_spi_reset_hw(master);
> 
> i'm not sure but this doesn't look like the right way (tm) to me.

Why?


> Btw according to dmaengine.h the usage of function
> dmaengine_terminate_all is deprecated. Would be nice to get this fixed, too.

Yes, migrating to dmaengine_terminate_sync() / _async() will prevent
the race addressed by the present patch in the DMA driver, relieving
DMA clients such as this SPI driver from that responsibility (see
commit message of b36f09c3c441).  It would obviate the need for the
dma_pending flag.

I agree that it is desirable to migrate to the new API but the point here
is to fix the existing code in patches small enough that they can be
backported to stable, not rework the driver to migrate to the new API.
(Which wouldn't be eligible for stable, and also, the new API was
introduced in v4.5 and the race in this SPI driver has existed since
v4.2, i.e. earlier.)

However I will have to respin this patch and change the type of the
dma_pending flag to unsigned int because cmpxchg() requires a 32-bit
argument on arches such as extensa and this patch breaks the build
on them for COMPILE_TEST=y.  This was reported by 0-day.

Thanks,

Lukas
Vinod Koul Nov. 11, 2018, 10:47 a.m. UTC | #3
On 10-11-18, 15:29, Lukas Wunner wrote:
> On Fri, Nov 09, 2018 at 04:28:48PM +0100, Stefan Wahren wrote:
> > Am 08.11.18 um 08:06 schrieb Lukas Wunner:
> > > If a DMA transfer finishes orderly right when spi_transfer_one_message()
> > > determines that it has timed out, the callbacks bcm2835_spi_dma_done()
> > > and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
> > > potentially leading to double termination.
> > >
> > > Prevent by atomically changing the dma_pending flag before calling
> > > dmaengine_terminate_all().
> > >
> > > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > > Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
> > > Cc: stable@vger.kernel.org # v4.2+
> > > ---
> > >  drivers/spi/spi-bcm2835.c | 10 ++++------
> > >  1 file changed, 4 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
> > > index 594e9712ecbc..774161bbcb2e 100644
> > > --- a/drivers/spi/spi-bcm2835.c
> > > +++ b/drivers/spi/spi-bcm2835.c
> > > @@ -232,10 +232,9 @@ static void bcm2835_spi_dma_done(void *data)
> > >  	 * is called the tx-dma must have finished - can't get to this
> > >  	 * situation otherwise...
> > >  	 */
> > > -	dmaengine_terminate_all(master->dma_tx);
> > > -
> > > -	/* mark as no longer pending */
> > > -	bs->dma_pending = 0;
> > > +	if (cmpxchg(&bs->dma_pending, true, false)) {
> > > +		dmaengine_terminate_all(master->dma_tx);
> > > +	}
> > >  
> > >  	/* and mark as completed */;
> > >  	complete(&master->xfer_completion);
> > > @@ -617,10 +616,9 @@ static void bcm2835_spi_handle_err(struct spi_master *master,
> > >  	struct bcm2835_spi *bs = spi_master_get_devdata(master);
> > >  
> > >  	/* if an error occurred and we have an active dma, then terminate */
> > > -	if (bs->dma_pending) {
> > > +	if (cmpxchg(&bs->dma_pending, true, false)) {
> > >  		dmaengine_terminate_all(master->dma_tx);
> > >  		dmaengine_terminate_all(master->dma_rx);
> > > -		bs->dma_pending = 0;
> > >  	}
> > >  	/* and reset */
> > >  	bcm2835_spi_reset_hw(master);
> > 
> > i'm not sure but this doesn't look like the right way (tm) to me.
> 
> Why?
> 
> 
> > Btw according to dmaengine.h the usage of function
> > dmaengine_terminate_all is deprecated. Would be nice to get this fixed, too.
> 
> Yes, migrating to dmaengine_terminate_sync() / _async() will prevent
> the race addressed by the present patch in the DMA driver, relieving
> DMA clients such as this SPI driver from that responsibility (see
> commit message of b36f09c3c441).  It would obviate the need for the
> dma_pending flag.
> 
> I agree that it is desirable to migrate to the new API but the point here
> is to fix the existing code in patches small enough that they can be
> backported to stable, not rework the driver to migrate to the new API.
> (Which wouldn't be eligible for stable, and also, the new API was
> introduced in v4.5 and the race in this SPI driver has existed since
> v4.2, i.e. earlier.)

I dont think porting the API introduction to stable would be an issue as
we fix things with those APIs

If you dont want that approach then why not submit this as fix to stable
pre 4.5 and for rest use the new APIs.

> However I will have to respin this patch and change the type of the
> dma_pending flag to unsigned int because cmpxchg() requires a 32-bit
> argument on arches such as extensa and this patch breaks the build on
> them for COMPILE_TEST=y.  This was reported by 0-day.
> 
> Thanks,
> 
> Lukas
Stefan Wahren Nov. 13, 2018, 8:21 a.m. UTC | #4
Hi Lukas,

Am 11.11.2018 um 11:47 schrieb Vinod Koul:
> On 10-11-18, 15:29, Lukas Wunner wrote:
>> On Fri, Nov 09, 2018 at 04:28:48PM +0100, Stefan Wahren wrote:
>>> Am 08.11.18 um 08:06 schrieb Lukas Wunner:
>>>> If a DMA transfer finishes orderly right when spi_transfer_one_message()
>>>> determines that it has timed out, the callbacks bcm2835_spi_dma_done()
>>>> and bcm2835_spi_handle_err() race to call dmaengine_terminate_all(),
>>>> potentially leading to double termination.
>>>>
>>>> Prevent by atomically changing the dma_pending flag before calling
>>>> dmaengine_terminate_all().
>>>>
>>>> Signed-off-by: Lukas Wunner <lukas@wunner.de>
>>>> Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
>>>> Cc: stable@vger.kernel.org # v4.2+
>>>> ---
>>>>  drivers/spi/spi-bcm2835.c | 10 ++++------
>>>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
>>>> index 594e9712ecbc..774161bbcb2e 100644
>>>> --- a/drivers/spi/spi-bcm2835.c
>>>> +++ b/drivers/spi/spi-bcm2835.c
>>>> @@ -232,10 +232,9 @@ static void bcm2835_spi_dma_done(void *data)
>>>>  	 * is called the tx-dma must have finished - can't get to this
>>>>  	 * situation otherwise...
>>>>  	 */
>>>> -	dmaengine_terminate_all(master->dma_tx);
>>>> -
>>>> -	/* mark as no longer pending */
>>>> -	bs->dma_pending = 0;
>>>> +	if (cmpxchg(&bs->dma_pending, true, false)) {
>>>> +		dmaengine_terminate_all(master->dma_tx);
>>>> +	}
>>>>  
>>>>  	/* and mark as completed */;
>>>>  	complete(&master->xfer_completion);
>>>> @@ -617,10 +616,9 @@ static void bcm2835_spi_handle_err(struct spi_master *master,
>>>>  	struct bcm2835_spi *bs = spi_master_get_devdata(master);
>>>>  
>>>>  	/* if an error occurred and we have an active dma, then terminate */
>>>> -	if (bs->dma_pending) {
>>>> +	if (cmpxchg(&bs->dma_pending, true, false)) {
>>>>  		dmaengine_terminate_all(master->dma_tx);
>>>>  		dmaengine_terminate_all(master->dma_rx);
>>>> -		bs->dma_pending = 0;
>>>>  	}
>>>>  	/* and reset */
>>>>  	bcm2835_spi_reset_hw(master);
>>> i'm not sure but this doesn't look like the right way (tm) to me.
>> Why?
>>
>>
>>> Btw according to dmaengine.h the usage of function
>>> dmaengine_terminate_all is deprecated. Would be nice to get this fixed, too.
>> Yes, migrating to dmaengine_terminate_sync() / _async() will prevent
>> the race addressed by the present patch in the DMA driver, relieving
>> DMA clients such as this SPI driver from that responsibility (see
>> commit message of b36f09c3c441).  It would obviate the need for the
>> dma_pending flag.
>>
>> I agree that it is desirable to migrate to the new API but the point here
>> is to fix the existing code in patches small enough that they can be
>> backported to stable, not rework the driver to migrate to the new API.
>> (Which wouldn't be eligible for stable, and also, the new API was
>> introduced in v4.5 and the race in this SPI driver has existed since
>> v4.2, i.e. earlier.)
> I dont think porting the API introduction to stable would be an issue as
> we fix things with those APIs
>
> If you dont want that approach then why not submit this as fix to stable
> pre 4.5 and for rest use the new APIs.

i assume your interested in 4.4 LTS and i prefer Vinod's suggestion.

Regards
Stefan

>
>> However I will have to respin this patch and change the type of the
>> dma_pending flag to unsigned int because cmpxchg() requires a 32-bit
>> argument on arches such as extensa and this patch breaks the build on
>> them for COMPILE_TEST=y.  This was reported by 0-day.
>>
>> Thanks,
>>
>> Lukas
diff mbox series

Patch

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index 594e9712ecbc..774161bbcb2e 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -232,10 +232,9 @@  static void bcm2835_spi_dma_done(void *data)
 	 * is called the tx-dma must have finished - can't get to this
 	 * situation otherwise...
 	 */
-	dmaengine_terminate_all(master->dma_tx);
-
-	/* mark as no longer pending */
-	bs->dma_pending = 0;
+	if (cmpxchg(&bs->dma_pending, true, false)) {
+		dmaengine_terminate_all(master->dma_tx);
+	}
 
 	/* and mark as completed */;
 	complete(&master->xfer_completion);
@@ -617,10 +616,9 @@  static void bcm2835_spi_handle_err(struct spi_master *master,
 	struct bcm2835_spi *bs = spi_master_get_devdata(master);
 
 	/* if an error occurred and we have an active dma, then terminate */
-	if (bs->dma_pending) {
+	if (cmpxchg(&bs->dma_pending, true, false)) {
 		dmaengine_terminate_all(master->dma_tx);
 		dmaengine_terminate_all(master->dma_rx);
-		bs->dma_pending = 0;
 	}
 	/* and reset */
 	bcm2835_spi_reset_hw(master);