diff mbox series

mt76: fix frag length allocation for usb

Message ID 1538554631-5168-1-git-send-email-sgruszka@redhat.com (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Headers show
Series mt76: fix frag length allocation for usb | expand

Commit Message

Stanislaw Gruszka Oct. 3, 2018, 8:17 a.m. UTC
This is correct fix for c12128ce44b0 ("mt76: use a per rx queue page
fragment cache"). We use wrong length when we allocate segments for
MCU transmissions, which require bigger segment size than e->buf_size.

Commit 481bb0432414 ("mt76: usb: make rx page_frag_cache access atomic")
partially solved the problem or actually mask it by changing
mt76u_mcu_init_rx() and mt76u_alloc_queues() sequence, so e->buf_size
become non zero any longer, but still not big enough to handle MCU data.

Patch fixes memory corruption which can manifest itself as random,
not easy to reproduce crashes, during mt76 driver load or unload.

Fixes: c12128ce44b0 ("mt76: use a per rx queue page fragment cache")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/net/wireless/mediatek/mt76/usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Lorenzo Bianconi Oct. 3, 2018, 9:12 a.m. UTC | #1
> This is correct fix for c12128ce44b0 ("mt76: use a per rx queue page
> fragment cache"). We use wrong length when we allocate segments for
> MCU transmissions, which require bigger segment size than e->buf_size.
> 
> Commit 481bb0432414 ("mt76: usb: make rx page_frag_cache access atomic")
> partially solved the problem or actually mask it by changing
> mt76u_mcu_init_rx() and mt76u_alloc_queues() sequence, so e->buf_size
> become non zero any longer, but still not big enough to handle MCU data.

Hi Stanislaw,

I agree that we should use len in page_frag_alloc() instead of q->buf_size, so

Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

but reviewing the code I guess the real issue is not q->buf_size (since it should
be bigger than MCU_RESP_URB_SIZE) but it is the sequence of calls in
mt76x0u_register_device() since mt76u_alloc_queues need to be called before
mt76u_mcu_init_rx()

Regards,
Lorenzo

> 
> Patch fixes memory corruption which can manifest itself as random,
> not easy to reproduce crashes, during mt76 driver load or unload.
> 
> Fixes: c12128ce44b0 ("mt76: use a per rx queue page fragment cache")
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> ---
>  drivers/net/wireless/mediatek/mt76/usb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> index de7785c4f6af..6b643ea701e3 100644
> --- a/drivers/net/wireless/mediatek/mt76/usb.c
> +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> @@ -286,7 +286,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76u_buf *buf,
>  		void *data;
>  		int offset;
>  
> -		data = page_frag_alloc(&q->rx_page, q->buf_size, GFP_ATOMIC);
> +		data = page_frag_alloc(&q->rx_page, len, GFP_ATOMIC);
>  		if (!data)
>  			break;
>  
> -- 
> 2.7.5
>
Stanislaw Gruszka Oct. 3, 2018, 10:19 a.m. UTC | #2
On Wed, Oct 03, 2018 at 11:12:07AM +0200, Lorenzo Bianconi wrote:
> > This is correct fix for c12128ce44b0 ("mt76: use a per rx queue page
> > fragment cache"). We use wrong length when we allocate segments for
> > MCU transmissions, which require bigger segment size than e->buf_size.
> > 
> > Commit 481bb0432414 ("mt76: usb: make rx page_frag_cache access atomic")
> > partially solved the problem or actually mask it by changing
> > mt76u_mcu_init_rx() and mt76u_alloc_queues() sequence, so e->buf_size
> > become non zero any longer, but still not big enough to handle MCU data.
> 
> Hi Stanislaw,
> 
> I agree that we should use len in page_frag_alloc() instead of q->buf_size, so
> 
> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> 
> but reviewing the code I guess the real issue is not q->buf_size (since it should
> be bigger than MCU_RESP_URB_SIZE) but it is the sequence of calls in

I added printk and there are allocations where len is bigger then
q->buf_size even with correct mt76u_alloc_queues, mt76u_mcu_init_rx
sequence for mt76x0u:

[16426.606090] q->buf_size 2048 len 2048 nsgs 8sglen 1728
[16426.606131] q->buf_size 2048 len 2048 nsgs 8sglen 1728
[16426.606134] q->buf_size 2048 len 2048 nsgs 8sglen 1728
<snip>
[16426.606464] q->buf_size 2048 len 2048 nsgs 8sglen 1728
[16426.607517] q->buf_size 2048 len 1024 nsgs 1sglen 1024
[16426.939268] q->buf_size 2048 len 14584 nsgs 1sglen 14584
[16426.984955] q->buf_size 2048 len 14584 nsgs 1sglen 14584

Not sure where it come from, but it's after MCU init (which is 1024
third line from end).

> mt76x0u_register_device() since mt76u_alloc_queues need to be called before
> mt76u_mcu_init_rx()

Ok, so this was already fixed in 

commit 481bb0432414f790066205fe77226b7d1877385d
Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Date:   Wed Sep 26 13:07:39 2018 +0200

    mt76: usb: make rx page_frag_cache access atomic

but then the sequence was changed again in

commit faa605bdfaa1322ea8e85791abdb3382a8cb4e0c
Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Date:   Fri Sep 28 13:39:00 2018 +0200

    mt76x0: usb: move initialization code in usb.c

Thanks
Stanislaw
Lorenzo Bianconi Oct. 3, 2018, 10:31 a.m. UTC | #3
> > 
> > Hi Stanislaw,
> > 
> > I agree that we should use len in page_frag_alloc() instead of q->buf_size, so
> > 
> > Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > 
> > but reviewing the code I guess the real issue is not q->buf_size (since it should
> > be bigger than MCU_RESP_URB_SIZE) but it is the sequence of calls in
> 
> I added printk and there are allocations where len is bigger then
> q->buf_size even with correct mt76u_alloc_queues, mt76u_mcu_init_rx
> sequence for mt76x0u:
> 
> [16426.606090] q->buf_size 2048 len 2048 nsgs 8sglen 1728
> [16426.606131] q->buf_size 2048 len 2048 nsgs 8sglen 1728
> [16426.606134] q->buf_size 2048 len 2048 nsgs 8sglen 1728
> <snip>
> [16426.606464] q->buf_size 2048 len 2048 nsgs 8sglen 1728
> [16426.607517] q->buf_size 2048 len 1024 nsgs 1sglen 1024
> [16426.939268] q->buf_size 2048 len 14584 nsgs 1sglen 14584
> [16426.984955] q->buf_size 2048 len 14584 nsgs 1sglen 14584
> 
> Not sure where it come from, but it's after MCU init (which is 1024
> third line from end).
> 
> > mt76x0u_register_device() since mt76u_alloc_queues need to be called before
> > mt76u_mcu_init_rx()
> 
> Ok, so this was already fixed in 
> 
> commit 481bb0432414f790066205fe77226b7d1877385d
> Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> Date:   Wed Sep 26 13:07:39 2018 +0200
> 
>     mt76: usb: make rx page_frag_cache access atomic
> 
> but then the sequence was changed again in
> 
> commit faa605bdfaa1322ea8e85791abdb3382a8cb4e0c
> Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> Date:   Fri Sep 28 13:39:00 2018 +0200
> 
>     mt76x0: usb: move initialization code in usb.c

That is an issue introduce in this commit, I am working on a fix.
Probably I got also the other reported issue. I will send a series to
test soon. Thanks.

Regards,
Lorenzo

> 
> Thanks
> Stanislaw
>
diff mbox series

Patch

diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
index de7785c4f6af..6b643ea701e3 100644
--- a/drivers/net/wireless/mediatek/mt76/usb.c
+++ b/drivers/net/wireless/mediatek/mt76/usb.c
@@ -286,7 +286,7 @@  mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76u_buf *buf,
 		void *data;
 		int offset;
 
-		data = page_frag_alloc(&q->rx_page, q->buf_size, GFP_ATOMIC);
+		data = page_frag_alloc(&q->rx_page, len, GFP_ATOMIC);
 		if (!data)
 			break;