diff mbox series

[net-next,7/7] net: marvell: prestera: fix port event handling on init

Message ID 20210203165458.28717-8-vadym.kochan@plvision.eu (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series Marvell Prestera Switchdev misc updates | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 2 maintainers not CCed: tchornyi@marvell.com vkochan@marvell.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Vadym Kochan Feb. 3, 2021, 4:54 p.m. UTC
For some reason there might be a crash during ports creation if port
events are handling at the same time  because fw may send initial
port event with down state.

The crash points to cancel_delayed_work() which is called when port went
is down.  Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.

The following is the crash which can be triggered:

[   28.311104] Unable to handle kernel paging request at virtual address
000071775f776600
[   28.319097] Mem abort info:
[   28.321914]   ESR = 0x96000004
[   28.324996]   EC = 0x25: DABT (current EL), IL = 32 bits
[   28.330350]   SET = 0, FnV = 0
[   28.333430]   EA = 0, S1PTW = 0
[   28.336597] Data abort info:
[   28.339499]   ISV = 0, ISS = 0x00000004
[   28.343362]   CM = 0, WnR = 0
[   28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100bf7000
[   28.352842] [000071775f776600] pgd=0000000000000000,
p4d=0000000000000000
[   28.359695] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[   28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[   28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[   28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[   28.391413] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[   28.397468] pc : get_work_pool+0x48/0x60
[   28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[   28.406018] sp : ffff80001391bc60
[   28.409358] x29: ffff80001391bc60 x28: 0000000000000000
[   28.414725] x27: ffff000104fc8b40 x26: ffff80001127de88
[   28.420089] x25: 0000000000000000 x24: ffff000106119760
[   28.425452] x23: ffff00010775dd60 x22: ffff00010567e000
[   28.430814] x21: 0000000000000000 x20: ffff80001391bcb0
[   28.436175] x19: ffff00010775deb8 x18: 00000000000000c0
[   28.441537] x17: 0000000000000000 x16: 000000008d9b0e88
[   28.446898] x15: 0000000000000001 x14: 00000000000002ba
[   28.452261] x13: 80a3002c00000002 x12: 00000000000005f4
[   28.457622] x11: 0000000000000030 x10: 000000000000000c
[   28.462985] x9 : 000000000000000c x8 : 0000000000000030
[   28.468346] x7 : ffff800014400000 x6 : ffff000106119758
[   28.473708] x5 : 0000000000000003 x4 : ffff00010775dc60
[   28.479068] x3 : 0000000000000000 x2 : 0000000000000060
[   28.484429] x1 : 000071775f776600 x0 : ffff00010775deb8
[   28.489791] Call trace:
[   28.492259]  get_work_pool+0x48/0x60
[   28.495874]  cancel_delayed_work+0x38/0xb0
[   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
[   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
[   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[   28.516660]  process_one_work+0x1e8/0x360
[   28.520710]  worker_thread+0x44/0x480
[   28.524412]  kthread+0x154/0x160
[   28.527670]  ret_from_fork+0x10/0x38
[   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
[   28.537429] ---[ end trace 5eced933df3a080b ]---

Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
---
 drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Feb. 5, 2021, 5:19 a.m. UTC | #1
On Wed,  3 Feb 2021 18:54:58 +0200 Vadym Kochan wrote:
> For some reason there might be a crash during ports creation if port
> events are handling at the same time  because fw may send initial
> port event with down state.
> 
> The crash points to cancel_delayed_work() which is called when port went
> is down.  Currently I did not find out the real cause of the issue, so
> fixed it by cancel port stats work only if previous port's state was up
> & runnig.

Maybe you just need to move the DELAYED_WORK_INIT() earlier?
Not sure why it's at the end of prestera_port_create(), it 
just initializes some fields.

> [   28.489791] Call trace:
> [   28.492259]  get_work_pool+0x48/0x60
> [   28.495874]  cancel_delayed_work+0x38/0xb0
> [   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
> [   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
> [   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
> [   28.516660]  process_one_work+0x1e8/0x360
> [   28.520710]  worker_thread+0x44/0x480
> [   28.524412]  kthread+0x154/0x160
> [   28.527670]  ret_from_fork+0x10/0x38
> [   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
> [   28.537429] ---[ end trace 5eced933df3a080b ]---
> 
> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
> ---
>  drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> index 39465e65d09b..122324dae47d 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> @@ -433,7 +433,8 @@ static void prestera_port_handle_event(struct prestera_switch *sw,
>  			netif_carrier_on(port->dev);
>  			if (!delayed_work_pending(caching_dw))
>  				queue_delayed_work(prestera_wq, caching_dw, 0);
> -		} else {
> +		} else if (netif_running(port->dev) &&
> +			   netif_carrier_ok(port->dev)) {
>  			netif_carrier_off(port->dev);
>  			if (delayed_work_pending(caching_dw))
>  				cancel_delayed_work(caching_dw);
Vadym Kochan Feb. 5, 2021, 12:31 p.m. UTC | #2
Hi Jakub,

On Thu, Feb 04, 2021 at 09:19:34PM -0800, Jakub Kicinski wrote:
> On Wed,  3 Feb 2021 18:54:58 +0200 Vadym Kochan wrote:
> > For some reason there might be a crash during ports creation if port
> > events are handling at the same time  because fw may send initial
> > port event with down state.
> > 
> > The crash points to cancel_delayed_work() which is called when port went
> > is down.  Currently I did not find out the real cause of the issue, so
> > fixed it by cancel port stats work only if previous port's state was up
> > & runnig.
> 
> Maybe you just need to move the DELAYED_WORK_INIT() earlier?
> Not sure why it's at the end of prestera_port_create(), it 
> just initializes some fields.
> 

Thanks for suggestion, but it does not help. Will try to find-out the
real root cause but this is the only fix I 'v came up.

> > [   28.489791] Call trace:
> > [   28.492259]  get_work_pool+0x48/0x60
> > [   28.495874]  cancel_delayed_work+0x38/0xb0
> > [   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
> > [   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
> > [   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
> > [   28.516660]  process_one_work+0x1e8/0x360
> > [   28.520710]  worker_thread+0x44/0x480
> > [   28.524412]  kthread+0x154/0x160
> > [   28.527670]  ret_from_fork+0x10/0x38
> > [   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
> > [   28.537429] ---[ end trace 5eced933df3a080b ]---
> > 
> > Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
> > ---
> >  drivers/net/ethernet/marvell/prestera/prestera_main.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > index 39465e65d09b..122324dae47d 100644
> > --- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > +++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
> > @@ -433,7 +433,8 @@ static void prestera_port_handle_event(struct prestera_switch *sw,
> >  			netif_carrier_on(port->dev);
> >  			if (!delayed_work_pending(caching_dw))
> >  				queue_delayed_work(prestera_wq, caching_dw, 0);
> > -		} else {
> > +		} else if (netif_running(port->dev) &&
> > +			   netif_carrier_ok(port->dev)) {
> >  			netif_carrier_off(port->dev);
> >  			if (delayed_work_pending(caching_dw))
> >  				cancel_delayed_work(caching_dw);
>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/marvell/prestera/prestera_main.c b/drivers/net/ethernet/marvell/prestera/prestera_main.c
index 39465e65d09b..122324dae47d 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_main.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_main.c
@@ -433,7 +433,8 @@  static void prestera_port_handle_event(struct prestera_switch *sw,
 			netif_carrier_on(port->dev);
 			if (!delayed_work_pending(caching_dw))
 				queue_delayed_work(prestera_wq, caching_dw, 0);
-		} else {
+		} else if (netif_running(port->dev) &&
+			   netif_carrier_ok(port->dev)) {
 			netif_carrier_off(port->dev);
 			if (delayed_work_pending(caching_dw))
 				cancel_delayed_work(caching_dw);