
[v5,2/2] mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv

Message ID 54f886c2fce5948a8743b9de65d36ec3e8adfaf1.1654229964.git.duoming@zju.edu.cn (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Remove useless param of devcoredump functions and fix bugs

Checks

Context                | Check   | Description
netdev/tree_selection  | success | Not a local patch

Commit Message

Duoming Zhou June 3, 2022, 5:09 a.m. UTC
There are sleep-in-atomic-context bugs when uploading device dump
data in mwifiex. The root cause is that dev_coredumpv() cannot be
used in atomic context, because it calls dev_set_name(), which
includes operations that may sleep. The call tree below shows the
execution path that can trigger the bug:

   (Interrupt context)
fw_dump_timer_fn
  mwifiex_upload_device_dump
    dev_coredumpv(..., GFP_KERNEL)
      dev_coredumpm()
        kzalloc(sizeof(*devcd), gfp); //may sleep
        dev_set_name
          kobject_set_name_vargs
            kvasprintf_const(GFP_KERNEL, ...); //may sleep
            kstrdup(s, GFP_KERNEL); //may sleep

The corresponding failure log is shown below:

[  135.275938] usb 1-1: == mwifiex dump information to /sys/class/devcoredump start
[  135.281029] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:265
...
[  135.293613] Call Trace:
[  135.293613]  <IRQ>
[  135.293613]  dump_stack_lvl+0x57/0x7d
[  135.293613]  __might_resched.cold+0x138/0x173
[  135.293613]  ? dev_coredumpm+0xca/0x2e0
[  135.293613]  kmem_cache_alloc_trace+0x189/0x1f0
[  135.293613]  ? devcd_match_failing+0x30/0x30
[  135.293613]  dev_coredumpm+0xca/0x2e0
[  135.293613]  ? devcd_freev+0x10/0x10
[  135.293613]  dev_coredumpv+0x1c/0x20
[  135.293613]  ? devcd_match_failing+0x30/0x30
[  135.293613]  mwifiex_upload_device_dump+0x65/0xb0
[  135.293613]  ? mwifiex_dnld_fw+0x1b0/0x1b0
[  135.293613]  call_timer_fn+0x122/0x3d0
[  135.293613]  ? msleep_interruptible+0xb0/0xb0
[  135.293613]  ? lock_downgrade+0x3c0/0x3c0
[  135.293613]  ? __next_timer_interrupt+0x13c/0x160
[  135.293613]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[  135.293613]  ? mwifiex_dnld_fw+0x1b0/0x1b0
[  135.293613]  __run_timers.part.0+0x3f8/0x540
[  135.293613]  ? call_timer_fn+0x3d0/0x3d0
[  135.293613]  ? arch_restore_msi_irqs+0x10/0x10
[  135.293613]  ? lapic_next_event+0x31/0x40
[  135.293613]  run_timer_softirq+0x4f/0xb0
[  135.293613]  __do_softirq+0x1c2/0x651
...
[  135.293613] RIP: 0010:default_idle+0xb/0x10
[  135.293613] RSP: 0018:ffff888006317e68 EFLAGS: 00000246
[  135.293613] RAX: ffffffff82ad8d10 RBX: ffff888006301cc0 RCX: ffffffff82ac90e1
[  135.293613] RDX: ffffed100d9ff1b4 RSI: ffffffff831ad140 RDI: ffffffff82ad8f20
[  135.293613] RBP: 0000000000000003 R08: 0000000000000000 R09: ffff88806cff8d9b
[  135.293613] R10: ffffed100d9ff1b3 R11: 0000000000000001 R12: ffffffff84593410
[  135.293613] R13: 0000000000000000 R14: 0000000000000000 R15: 1ffff11000c62fd2
...
[  135.389205] usb 1-1: == mwifiex dump information to /sys/class/devcoredump end

This patch replaces the timer with delayed work and moves the
operations that may sleep into the work handler, which runs in
process context. It was tested on a Marvell 88W8801 chip with a USB
interface, using the firmware usb8801_uapsta.bin. The result after
replacing the timer with delayed work is shown below:

[  134.936453] usb 1-1: == mwifiex dump information to /sys/class/devcoredump start
[  135.043344] usb 1-1: == mwifiex dump information to /sys/class/devcoredump end

The "sleeping function called from invalid context" warning no longer appears.

Fixes: f5ecd02a8b20 ("mwifiex: device dump support for usb interface")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
Changes in v5:
  - Use delayed work to replace timer.

 drivers/net/wireless/marvell/mwifiex/init.c      | 10 ++++++----
 drivers/net/wireless/marvell/mwifiex/main.h      |  2 +-
 drivers/net/wireless/marvell/mwifiex/sta_event.c |  6 +++---
 3 files changed, 10 insertions(+), 8 deletions(-)
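
A minimal userspace sketch of why the change described above helps. It is not driver code: every demo_* name is hypothetical, and the atomic-context check is a plain counter standing in for the kernel's might_sleep() debugging. It contrasts a timer callback that uploads directly (the old scheme) with a timer expiry that only marks work as pending and leaves the upload to a worker in process context, which is roughly how delayed work behaves.

```c
#include <stdbool.h>

/* All names here are hypothetical; this only models the two contexts. */
static int demo_atomic_depth;   /* > 0 models timer/softirq (atomic) context */
static int demo_violations;     /* counts "sleeping function called from invalid context" */
static int demo_uploads;        /* counts completed dump uploads */
static bool demo_work_pending;  /* models a queued work item */

/* Stand-in for might_sleep(): record a violation if called while atomic. */
static void demo_might_sleep(void)
{
	if (demo_atomic_depth > 0)
		demo_violations++;
}

/* Stand-in for the upload path, which may sleep while allocating memory. */
static void demo_upload_device_dump(void)
{
	demo_might_sleep();
	demo_uploads++;
}

/* Old scheme: the timer callback runs in atomic context and uploads directly. */
static void demo_timer_expiry_old(void)
{
	demo_atomic_depth++;
	demo_upload_device_dump();
	demo_atomic_depth--;
}

/* New scheme: the expiry only queues the work; nothing here may sleep. */
static void demo_timer_expiry_new(void)
{
	demo_atomic_depth++;
	demo_work_pending = true;
	demo_atomic_depth--;
}

/* The worker runs later in process context, where sleeping is allowed. */
static void demo_worker_run(void)
{
	if (!demo_work_pending)
		return;
	demo_work_pending = false;
	demo_upload_device_dump();
}
```

Calling demo_timer_expiry_old() records one violation, whereas demo_timer_expiry_new() followed by demo_worker_run() completes the upload without recording another.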

Comments

Brian Norris June 6, 2022, 6:14 p.m. UTC | #1
On Fri, Jun 03, 2022 at 01:09:35PM +0800, Duoming Zhou wrote:
> There are sleep in atomic context bugs when uploading device dump
> data in mwifiex. The root cause is that dev_coredumpv could not
> be used in atomic contexts, because it calls dev_set_name which
> include operations that may sleep. The call tree shows execution
> paths that could lead to bugs:
...
> Fixes: f5ecd02a8b20 ("mwifiex: device dump support for usb interface")
> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
> Changes in v5:
>   - Use delayed work to replace timer.
> 
>  drivers/net/wireless/marvell/mwifiex/init.c      | 10 ++++++----
>  drivers/net/wireless/marvell/mwifiex/main.h      |  2 +-
>  drivers/net/wireless/marvell/mwifiex/sta_event.c |  6 +++---
>  3 files changed, 10 insertions(+), 8 deletions(-)

Looks great! Thanks for working on this.

Reviewed-by: Brian Norris <briannorris@chromium.org>

Some small nitpicks below, but they're definitely not critical.

> diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> index 88c72d1827a..3713f3e323f 100644
> --- a/drivers/net/wireless/marvell/mwifiex/init.c
> +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> @@ -63,9 +63,11 @@ static void wakeup_timer_fn(struct timer_list *t)
>  		adapter->if_ops.card_reset(adapter);
>  }
>  
> -static void fw_dump_timer_fn(struct timer_list *t)
> +static void fw_dump_work(struct work_struct *work)
>  {
> -	struct mwifiex_adapter *adapter = from_timer(adapter, t, devdump_timer);
> +	struct mwifiex_adapter *adapter = container_of(work,
> +					struct mwifiex_adapter,
> +					devdump_work.work);

Super nitpicky: the hanging indent style seems a bit off. I typically
see people try to align to the first character after the parenthesis,
like:

	struct mwifiex_adapter *adapter = container_of(work,
						       struct mwifiex_adapter,
						       devdump_work.work);

It's not a clearly-specified style rule I think, so I definitely
wouldn't insist.

On the bright side: I think the clang-format rules (in .clang-format)
are getting better, so one can make some formatting decisions via tools
instead of opinion and close reading! Unfortunately, we probably can't
do that extensively and automatically, because I doubt people will love
all the reformatting because of all the existing inconsistent style.

Anyway, to cut to the chase: clang-format chooses moving to a new line:

	struct mwifiex_adapter *adapter =
		container_of(work, struct mwifiex_adapter, devdump_work.work);

More info if you're interested:
https://www.kernel.org/doc/html/latest/process/clang-format.html

>  
>  	mwifiex_upload_device_dump(adapter);
>  }

...

> diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
> index 332dd1c8db3..6530c6ee308 100644
> --- a/drivers/net/wireless/marvell/mwifiex/main.h
> +++ b/drivers/net/wireless/marvell/mwifiex/main.h
> @@ -1055,7 +1055,7 @@ struct mwifiex_adapter {

Nitpick: main.h is probably missing a lot of #includes, but you could
probably add <linux/workqueue.h> while you're at it.

Brian

>  	/* Device dump data/length */
>  	void *devdump_data;
>  	int devdump_len;
> -	struct timer_list devdump_timer;
> +	struct delayed_work devdump_work;
>  
>  	bool ignore_btcoex_events;
>  };

Duoming Zhou June 7, 2022, 12:22 a.m. UTC | #2
Hello,

On Mon, 6 Jun 2022 11:14:05 -0700 Brian wrote:

> On Fri, Jun 03, 2022 at 01:09:35PM +0800, Duoming Zhou wrote:
> > There are sleep in atomic context bugs when uploading device dump
> > data in mwifiex. The root cause is that dev_coredumpv could not
> > be used in atomic contexts, because it calls dev_set_name which
> > include operations that may sleep. The call tree shows execution
> > paths that could lead to bugs:
> ...
> > Fixes: f5ecd02a8b20 ("mwifiex: device dump support for usb interface")
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> > Changes in v5:
> >   - Use delayed work to replace timer.
> > 
> >  drivers/net/wireless/marvell/mwifiex/init.c      | 10 ++++++----
> >  drivers/net/wireless/marvell/mwifiex/main.h      |  2 +-
> >  drivers/net/wireless/marvell/mwifiex/sta_event.c |  6 +++---
> >  3 files changed, 10 insertions(+), 8 deletions(-)
> 
> Looks great! Thanks for working on this.
> 
> Reviewed-by: Brian Norris <briannorris@chromium.org>
> 
> Some small nitpicks below, but they're definitely not critical.

Thank you for your time and approval!

> > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> > index 88c72d1827a..3713f3e323f 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > @@ -63,9 +63,11 @@ static void wakeup_timer_fn(struct timer_list *t)
> >  		adapter->if_ops.card_reset(adapter);
> >  }
> >  
> > -static void fw_dump_timer_fn(struct timer_list *t)
> > +static void fw_dump_work(struct work_struct *work)
> >  {
> > -	struct mwifiex_adapter *adapter = from_timer(adapter, t, devdump_timer);
> > +	struct mwifiex_adapter *adapter = container_of(work,
> > +					struct mwifiex_adapter,
> > +					devdump_work.work);
> 
> Super nitpicky: the hanging indent style seems a bit off. I typically
> see people try to align to the first character after the parenthesis,
> like:
> 
> 	struct mwifiex_adapter *adapter = container_of(work,
> 						       struct mwifiex_adapter,
> 						       devdump_work.work);
> 
> It's not a clearly-specified style rule I think, so I definitely
> wouldn't insist.
> 
> On the bright side: I think the clang-format rules (in .clang-format)
> are getting better, so one can make some formatting decisions via tools
> instead of opinion and close reading! Unfortunately, we probably can't
> do that extensively and automatically, because I doubt people will love
> all the reformatting because of all the existing inconsistent style.
> 
> Anyway, to cut to the chase: clang-format chooses moving to a new line:
> 
> 	struct mwifiex_adapter *adapter =
> 		container_of(work, struct mwifiex_adapter, devdump_work.work);
> 
> More info if you're interested:
> https://www.kernel.org/doc/html/latest/process/clang-format.html
> 
> >  
> >  	mwifiex_upload_device_dump(adapter);
> >  }

Thanks for your suggestions! I will use clang-format to adjust the format.

> > diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
> > index 332dd1c8db3..6530c6ee308 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/main.h
> > +++ b/drivers/net/wireless/marvell/mwifiex/main.h
> > @@ -1055,7 +1055,7 @@ struct mwifiex_adapter {
> 
> Nitpick: main.h is probably missing a lot of #includes, but you could
> probably add <linux/workqueue.h> while you're at it.

I will add <linux/workqueue.h> in main.h.

> 
> >  	/* Device dump data/length */
> >  	void *devdump_data;
> >  	int devdump_len;
> > -	struct timer_list devdump_timer;
> > +	struct delayed_work devdump_work;
> >  
> >  	bool ignore_btcoex_events;
> >  };

Best regards,
Duoming Zhou

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
index 88c72d1827a..3713f3e323f 100644
--- a/drivers/net/wireless/marvell/mwifiex/init.c
+++ b/drivers/net/wireless/marvell/mwifiex/init.c
@@ -63,9 +63,11 @@  static void wakeup_timer_fn(struct timer_list *t)
 		adapter->if_ops.card_reset(adapter);
 }
 
-static void fw_dump_timer_fn(struct timer_list *t)
+static void fw_dump_work(struct work_struct *work)
 {
-	struct mwifiex_adapter *adapter = from_timer(adapter, t, devdump_timer);
+	struct mwifiex_adapter *adapter = container_of(work,
+					struct mwifiex_adapter,
+					devdump_work.work);
 
 	mwifiex_upload_device_dump(adapter);
 }
@@ -321,7 +323,7 @@  static void mwifiex_init_adapter(struct mwifiex_adapter *adapter)
 	adapter->active_scan_triggered = false;
 	timer_setup(&adapter->wakeup_timer, wakeup_timer_fn, 0);
 	adapter->devdump_len = 0;
-	timer_setup(&adapter->devdump_timer, fw_dump_timer_fn, 0);
+	INIT_DELAYED_WORK(&adapter->devdump_work, fw_dump_work);
 }
 
 /*
@@ -400,7 +402,7 @@  static void
 mwifiex_adapter_cleanup(struct mwifiex_adapter *adapter)
 {
 	del_timer(&adapter->wakeup_timer);
-	del_timer_sync(&adapter->devdump_timer);
+	cancel_delayed_work_sync(&adapter->devdump_work);
 	mwifiex_cancel_all_pending_cmd(adapter);
 	wake_up_interruptible(&adapter->cmd_wait_q.wait);
 	wake_up_interruptible(&adapter->hs_activate_wait_q);
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
index 332dd1c8db3..6530c6ee308 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -1055,7 +1055,7 @@  struct mwifiex_adapter {
 	/* Device dump data/length */
 	void *devdump_data;
 	int devdump_len;
-	struct timer_list devdump_timer;
+	struct delayed_work devdump_work;
 
 	bool ignore_btcoex_events;
 };
diff --git a/drivers/net/wireless/marvell/mwifiex/sta_event.c b/drivers/net/wireless/marvell/mwifiex/sta_event.c
index 7d42c5d2dbf..4d93386494c 100644
--- a/drivers/net/wireless/marvell/mwifiex/sta_event.c
+++ b/drivers/net/wireless/marvell/mwifiex/sta_event.c
@@ -623,8 +623,8 @@  mwifiex_fw_dump_info_event(struct mwifiex_private *priv,
 		 * transmission event get lost, in this cornel case,
 		 * user would still get partial of the dump.
 		 */
-		mod_timer(&adapter->devdump_timer,
-			  jiffies + msecs_to_jiffies(MWIFIEX_TIMER_10S));
+		schedule_delayed_work(&adapter->devdump_work,
+				      msecs_to_jiffies(MWIFIEX_TIMER_10S));
 	}
 
 	/* Overflow check */
@@ -643,7 +643,7 @@  mwifiex_fw_dump_info_event(struct mwifiex_private *priv,
 	return;
 
 upload_dump:
-	del_timer_sync(&adapter->devdump_timer);
+	cancel_delayed_work_sync(&adapter->devdump_work);
 	mwifiex_upload_device_dump(adapter);
 }
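
As a rough illustration of the container_of() step in fw_dump_work(), the following userspace sketch recovers an enclosing structure from a pointer to a work item embedded inside it. The struct layouts and all demo_* names are simplified stand-ins assumed for this example. The kernel's own container_of() macro also performs type checking, which this version omits.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the nesting matters here. */
struct work_struct {
	int pending;
};

struct delayed_work {
	struct work_struct work;
	long expires;	/* placeholder for the embedded timer */
};

/* Hypothetical stand-in for struct mwifiex_adapter. */
struct demo_adapter {
	int devdump_len;
	struct delayed_work devdump_work;
};

/* Subtract the member's offset from its address to reach the container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mirrors what the work handler does with its struct work_struct argument. */
static struct demo_adapter *demo_adapter_from_work(struct work_struct *work)
{
	return demo_container_of(work, struct demo_adapter, devdump_work.work);
}
```

Given a struct demo_adapter a, demo_adapter_from_work(&a.devdump_work.work) returns &a. The handler needs this step because it is only passed the innermost struct work_struct pointer, which is why the patch names the nested member devdump_work.work.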