Message ID | 727eb5ffd3c7c805245e512da150ecf0a7154020.1659452909.git.deren.wu@mediatek.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Felix Fietkau |
Series | mt76: mt7921e: fix crash in chip reset fail |
Hi Kalle,

If the patch looks good to you, could you help apply the patch to
wireless-drivers.git because there are getting more users reporting
the issue with the stable kernel such as [1]. I would like to backport
it earlier once it appears in the Linus tree to solve the indefinite
hang issue.

[1] https://lore.kernel.org/linux-wireless/VE1PR04MB64945C660A81D38F290E4A4BE59F9@VE1PR04MB6494.eurprd04.prod.outlook.com/T/

Sean

On Tue, Aug 2, 2022 at 8:20 AM Deren Wu <Deren.Wu@mediatek.com> wrote:
>
> From: Deren Wu <deren.wu@mediatek.com>
>
> In case of drv own fail in reset, we may need to run mac_reset several
> times. The sequence would trigger system crash as the log below.
>
> Because we do not re-enable/schedule "tx_napi" before disable it again,
> the process would keep waiting for state change in napi_diable(). To
> avoid the problem and keep status synchronize for each run, goto final
> resource handling if drv own failed.
>
> [ 5857.353423] mt7921e 0000:3b:00.0: driver own failed
> [ 5858.433427] mt7921e 0000:3b:00.0: Timeout for driver own
> [ 5859.633430] mt7921e 0000:3b:00.0: driver own failed
> [ 5859.633444] ------------[ cut here ]------------
> [ 5859.633446] WARNING: CPU: 6 at kernel/kthread.c:659 kthread_park+0x11d
> [ 5859.633717] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
> [ 5859.633728] RIP: 0010:kthread_park+0x11d/0x150
> [ 5859.633736] RSP: 0018:ffff8881b676fc68 EFLAGS: 00010202
> ......
> [ 5859.633766] Call Trace:
> [ 5859.633768] <TASK>
> [ 5859.633771] mt7921e_mac_reset+0x176/0x6f0 [mt7921e]
> [ 5859.633778] mt7921_mac_reset_work+0x184/0x3a0 [mt7921_common]
> [ 5859.633785] ? mt7921_mac_set_timing+0x520/0x520 [mt7921_common]
> [ 5859.633794] ? __kasan_check_read+0x11/0x20
> [ 5859.633802] process_one_work+0x7ee/0x1320
> [ 5859.633810] worker_thread+0x53c/0x1240
> [ 5859.633818] kthread+0x2b8/0x370
> [ 5859.633824] ? process_one_work+0x1320/0x1320
> [ 5859.633828] ? kthread_complete_and_exit+0x30/0x30
> [ 5859.633834] ret_from_fork+0x1f/0x30
> [ 5859.633842] </TASK>
>
> Fixes: 0efaf31dec57 ("mt76: mt7921: fix MT7921E reset failure")
> Signed-off-by: Deren Wu <deren.wu@mediatek.com>
> ---
>  drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> index e1800674089a..576a0149251b 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> @@ -261,7 +261,7 @@ int mt7921e_mac_reset(struct mt7921_dev *dev)
>
>         err = mt7921e_driver_own(dev);
>         if (err)
> -               return err;
> +               goto out;
>
>         err = mt7921_run_firmware(dev);
>         if (err)
> --
> 2.18.0
>
Hi Johannes,

Kalle seemed not available this week, so I would like to look for help from you.
If the patch looks good to you, could you help apply the patch to
wireless-drivers.git because there are getting more users reporting
the issue with the stable kernel such as [1]. I would like to backport
it sooner once it appears in the Linus tree to solve the indefinite
hang issue. Sorry for the hurry request, I knew you just sent the pull
request one moment ago :(

[1] https://lore.kernel.org/linux-wireless/VE1PR04MB64945C660A81D38F290E4A4BE59F9@VE1PR04MB6494.eurprd04.prod.outlook.com/T/

Sean

On Wed, Aug 24, 2022 at 6:45 PM sean wang <objelf@gmail.com> wrote:
>
> Hi Kalle,
>
> If the patch looks good to you, could you help apply the patch to
> wireless-drivers.git because there are getting more users reporting
> the issue with the stable kernel such as [1]. I would like to backport
> it earlier once it appears in the Linus tree to solve the indefinite
> hang issue.
>
> [1] https://lore.kernel.org/linux-wireless/VE1PR04MB64945C660A81D38F290E4A4BE59F9@VE1PR04MB6494.eurprd04.prod.outlook.com/T/
>
> Sean
>
> On Tue, Aug 2, 2022 at 8:20 AM Deren Wu <Deren.Wu@mediatek.com> wrote:
> >
> > From: Deren Wu <deren.wu@mediatek.com>
> >
> > In case of drv own fail in reset, we may need to run mac_reset several
> > times. The sequence would trigger system crash as the log below.
> >
> > Because we do not re-enable/schedule "tx_napi" before disable it again,
> > the process would keep waiting for state change in napi_diable(). To
> > avoid the problem and keep status synchronize for each run, goto final
> > resource handling if drv own failed.
> >
> > [ 5857.353423] mt7921e 0000:3b:00.0: driver own failed
> > [ 5858.433427] mt7921e 0000:3b:00.0: Timeout for driver own
> > [ 5859.633430] mt7921e 0000:3b:00.0: driver own failed
> > [ 5859.633444] ------------[ cut here ]------------
> > [ 5859.633446] WARNING: CPU: 6 at kernel/kthread.c:659 kthread_park+0x11d
> > [ 5859.633717] Workqueue: mt76 mt7921_mac_reset_work [mt7921_common]
> > [ 5859.633728] RIP: 0010:kthread_park+0x11d/0x150
> > [ 5859.633736] RSP: 0018:ffff8881b676fc68 EFLAGS: 00010202
> > ......
> > [ 5859.633766] Call Trace:
> > [ 5859.633768] <TASK>
> > [ 5859.633771] mt7921e_mac_reset+0x176/0x6f0 [mt7921e]
> > [ 5859.633778] mt7921_mac_reset_work+0x184/0x3a0 [mt7921_common]
> > [ 5859.633785] ? mt7921_mac_set_timing+0x520/0x520 [mt7921_common]
> > [ 5859.633794] ? __kasan_check_read+0x11/0x20
> > [ 5859.633802] process_one_work+0x7ee/0x1320
> > [ 5859.633810] worker_thread+0x53c/0x1240
> > [ 5859.633818] kthread+0x2b8/0x370
> > [ 5859.633824] ? process_one_work+0x1320/0x1320
> > [ 5859.633828] ? kthread_complete_and_exit+0x30/0x30
> > [ 5859.633834] ret_from_fork+0x1f/0x30
> > [ 5859.633842] </TASK>
> >
> > Fixes: 0efaf31dec57 ("mt76: mt7921: fix MT7921E reset failure")
> > Signed-off-by: Deren Wu <deren.wu@mediatek.com>
> > ---
> >  drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> > index e1800674089a..576a0149251b 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
> > @@ -261,7 +261,7 @@ int mt7921e_mac_reset(struct mt7921_dev *dev)
> >
> >         err = mt7921e_driver_own(dev);
> >         if (err)
> > -               return err;
> > +               goto out;
> >
> >         err = mt7921_run_firmware(dev);
> >         if (err)
> > --
> > 2.18.0
> >
Sean Wang <sean.wang@kernel.org> writes:

> Kalle seemed not available this week, so I would like to look for help from you.
> If the patch looks good to you, could you help apply the patch to
> wireless-drivers.git because there are getting more users reporting
> the issue with the stable kernel such as [1]. I would like to backport
> it sooner once it appears in the Linus tree to solve the indefinite
> hang issue. Sorry for the hurry request, I knew you just sent the pull
> request one moment ago :(
>
> [1]
> https://lore.kernel.org/linux-wireless/VE1PR04MB64945C660A81D38F290E4A4BE59F9@VE1PR04MB6494.eurprd04.prod.outlook.com/T/

Johannes applied this now:

https://git.kernel.org/wireless/wireless/c/fa3fbe640378
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
index e1800674089a..576a0149251b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c
@@ -261,7 +261,7 @@ int mt7921e_mac_reset(struct mt7921_dev *dev)
 
 	err = mt7921e_driver_own(dev);
 	if (err)
-		return err;
+		goto out;
 
 	err = mt7921_run_firmware(dev);
 	if (err)
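
For readers following the thread, the effect of the one-line change can be modelled in user space. The sketch below is not driver code: `struct fake_dev`, `fake_napi_disable()`, `fake_napi_enable()`, `fake_driver_own()`, `mac_reset_return()`, `mac_reset_goto()` and `count_stuck_disables()` are all hypothetical stand-ins, and it assumes (as the commit message implies) that the `out` label is where tx_napi is re-enabled. A disable issued while the model is already disabled is counted as the point where the real napi_disable() would keep waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct mt7921_dev. */
struct fake_dev {
	bool napi_enabled;   /* models the tx_napi enable state */
	int own_failures;    /* how many times "driver own" should still fail */
	int stuck_disables;  /* disables issued while already disabled */
};

/* Models napi_disable(): a second disable without an enable in between
 * is counted as a hang instead of actually blocking. */
static void fake_napi_disable(struct fake_dev *dev)
{
	if (!dev->napi_enabled)
		dev->stuck_disables++;
	dev->napi_enabled = false;
}

static void fake_napi_enable(struct fake_dev *dev)
{
	dev->napi_enabled = true;
}

/* Hypothetical stand-in for mt7921e_driver_own(): fails a preset number of times. */
static int fake_driver_own(struct fake_dev *dev)
{
	if (dev->own_failures > 0) {
		dev->own_failures--;
		return -1;
	}
	return 0;
}

/* Pre-patch shape: the early return skips the re-enable step. */
static int mac_reset_return(struct fake_dev *dev)
{
	int err;

	fake_napi_disable(dev);
	err = fake_driver_own(dev);
	if (err)
		return err;
	fake_napi_enable(dev);
	return 0;
}

/* Post-patch shape: a failure jumps to the common exit path. */
static int mac_reset_goto(struct fake_dev *dev)
{
	int err;

	fake_napi_disable(dev);
	err = fake_driver_own(dev);
	if (err)
		goto out;
	/* the firmware restart steps would run here */
out:
	fake_napi_enable(dev);
	return err;
}

/* Retries the reset until it succeeds and reports how many disables
 * were issued against an already disabled device. */
static int count_stuck_disables(int (*reset)(struct fake_dev *), int failures)
{
	struct fake_dev dev = { .napi_enabled = true, .own_failures = failures };
	int tries = 0;

	while (reset(&dev) != 0)
		tries++;
	assert(tries == failures);
	return dev.stuck_disables;
}
```

In this model, `count_stuck_disables(mac_reset_return, 2)` returns 2, one for each retry that follows a failed "driver own", while `count_stuck_disables(mac_reset_goto, 2)` returns 0 because every run leaves the state balanced for the next one.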