mbox series

[v2,0/5] wcn36xx: Fix DMA buffer allocation and free logic

Message ID 20211018231722.873525-1-bryan.odonoghue@linaro.org (mailing list archive)
Headers show
Series wcn36xx: Fix DMA buffer allocation and free logic | expand

Message

Bryan O'Donoghue Oct. 18, 2021, 11:17 p.m. UTC
V2:
- Functionally decomposes the DXE reset in an additional patch.
  Since we call this logic more than once, it should be in a function.

- Leaves as-is the DXE reset write.

  Johannes Berg asked me if we are sure by the time the write to the reset
  register completes that DXE transactions will be suitably quiesced.

  The answer is:
  1. I believe these writes are non-posted writes
  2. Downstream doesn't poll for DXE reset completion

  So on #2 I have no real data for or against a polling operation, my tests
  indicate the reset indication in the register is atomic and as far as I
  can discern that also means DMA transactions are terminated.

V1:
Digging around through some bugs reported from an extensive testing cycle
we've found that wcn36xx has a number of unexplained RX related oopses.

In at least one case we appear to have DMA'd data to an unmapped region.
The written data appears to be a correctly formed DMA buffer descriptor - a
DXE BD in WCNSS parlance, with an AP beacon inside of it.

Reasoning about how such a situation might come about and reviewing the
run-time code, there was no obvious path where we might free a BD or an
skbuff pointed to by a BD, which DXE might not be aware of.

However looking at the ieee80211_ops.start and ieee80211_ops.stop callbacks
in wcn36xx we can see a number of bugs associated with BD allocation, error
handling and leaving the DMA engine active, despite freeing SKBs on the MSM
side.

This last mention - failure to quiesce potential DMA from the downstream
agent - WCNSS DXE despite freeing the memory @ the skbuffs is a decent
candidate for our unexplained upstream DMA transaction to unmapped memory.

Since wcn36xx_stop and wcn36xx_start can be called a number of times by the
controlling upper layers it means there is a potential gap between
wcn36xx_stop and wcn36xx_start which could leave WCNSS in a state where it
will try to DMA to memory we have freed.

This series addresses the obvious bugs that jump out on the start()/stop()
path.

Patch #1
  In order to make it easier to read the DXE code, I've moved all of the
  lock taking and freeing for DXE into dxe.c

Patch #2
  Fixes a very obviously broken channel enable/disable cycle

Patch #3
  Fixes a very obvious memory leak with dma_alloc_coherent()

Patch #4
  Makes sure before we release skbuffs which we assigned to the RX channels
  that we ensure the DXE block is put into reset

Bryan O'Donoghue (5):
  wcn36xx: Fix dxe lock layering violation
  wcn36xx: Fix DMA channel enable/disable cycle
  wcn36xx: Release DMA channel descriptor allocations
  wcn36xx: Functionally decompose DXE reset
  wcn36xx: Put DXE block into reset before freeing memory

 drivers/net/wireless/ath/wcn36xx/dxe.c  | 83 +++++++++++++++++++++----
 drivers/net/wireless/ath/wcn36xx/dxe.h  |  2 +
 drivers/net/wireless/ath/wcn36xx/txrx.c | 15 +----
 3 files changed, 74 insertions(+), 26 deletions(-)

Comments

Jeff Johnson Oct. 19, 2021, 3:59 p.m. UTC | #1
On 10/18/2021 4:17 PM, Bryan O'Donoghue wrote:
> V2:
> - Functionally decomposes the DXE reset in an additional patch.
>    Since we call this logic more than once, it should be in a function.
> 
> - Leaves as-is the DXE reset write.
> 
>    Johannes Berg asked me if we are sure by the time the write to the reset
>    register completes that DXE transactions will be suitably quiesced.
> 
>    The answer is:
>    1. I believe these writes are non-posted writes
>    2. Downstream doesn't poll for DXE reset completion
> 
>    So on #2 I have no real data for or against a polling operation, my tests
>    indicate the reset indication in the register is atomic and as far as I
>    can discern that also means DMA transactions are terminated.
> 
> V1:
> Digging around through some bugs reported from an extensive testing cycle
> we've found that wcn36xx has a number of unexplained RX related oopses.
> 
> In at least one case we appear to have DMA'd data to an unmapped region.
> The written data appears to be a correctly formed DMA buffer descriptor - a
> DXE BD in WCNSS parlance, with an AP beacon inside of it.
> 
> Reasoning about how such a situation might come about and reviewing the
> run-time code, there was no obvious path where we might free a BD or an
> skbuff pointed to by a BD, which DXE might not be aware of.
> 
> However looking at the ieee80211_ops.start and ieee80211_ops.stop callbacks
> in wcn36xx we can see a number of bugs associated with BD allocation, error
> handling and leaving the DMA engine active, despite freeing SKBs on the MSM
> side.
> 
> This last mention - failure to quiesce potential DMA from the downstream
> agent - WCNSS DXE despite freeing the memory @ the skbuffs is a decent
> candidate for our unexplained upstream DMA transaction to unmapped memory.
> 
> Since wcn36xx_stop and wcn36xx_start can be called a number of times by the
> controlling upper layers it means there is a potential gap between
> wcn36xx_stop and wcn36xx_start which could leave WCNSS in a state where it
> will try to DMA to memory we have freed.
> 
> This series addresses the obvious bugs that jump out on the start()/stop()
> path.
> 
> Patch #1
>    In order to make it easier to read the DXE code, I've moved all of the
>    lock taking and freeing for DXE into dxe.c
> 
> Patch #2
>    Fixes a very obviously broken channel enable/disable cycle
> 
> Patch #3
>    Fixes a very obvious memory leak with dma_alloc_coherent()
> 
> Patch #4
>    Makes sure before we release skbuffs which we assigned to the RX channels
>    that we ensure the DXE block is put into reset
> 
> Bryan O'Donoghue (5):
>    wcn36xx: Fix dxe lock layering violation
>    wcn36xx: Fix DMA channel enable/disable cycle
>    wcn36xx: Release DMA channel descriptor allocations
>    wcn36xx: Functionally decompose DXE reset
>    wcn36xx: Put DXE block into reset before freeing memory
> 
>   drivers/net/wireless/ath/wcn36xx/dxe.c  | 83 +++++++++++++++++++++----
>   drivers/net/wireless/ath/wcn36xx/dxe.h  |  2 +
>   drivers/net/wireless/ath/wcn36xx/txrx.c | 15 +----
>   3 files changed, 74 insertions(+), 26 deletions(-)
> 

Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>