diff mbox series

mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread

Message ID 20230218075956.1563118-1-zyytlz.wz@163.com (mailing list archive)
State Rejected
Delegated to: Kalle Valo
Headers show
Series mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread | expand

Commit Message

Zheng Wang Feb. 18, 2023, 7:59 a.m. UTC
This is a potential race condition by executing the following order.

In summary, the adapter could be freed in timer function and be used after
that. The race condition needs 10s window which could be extended by the
paper : https://www.usenix.org/system/files/sec21-lee-yoochan.pdf

And the function in wakeup_timer_fn may have the same problem.

I dont't really know how to fix that, so I just removed the reset call,
which is totally wrong. If you know anything abouth the fix,
plz free to let me know.

Note that, this bug is found by static analysis, it could be wrong. We
could discuss that before writing the fix.

        CPU0                                                        CPU1
mwifiex_sdio_probe
mwifiex_add_card
mwifiex_init_hw_fw
request_firmware_nowait
  mwifiex_fw_dpc
    _mwifiex_fw_dpc
      mwifiex_init_fw
        mwifiex_main_process
          mwifiex_exec_next_cmd
            mwifiex_dnld_cmd_to_fw
              mod_timer(&adapter->cmd_timer,..)
                                                mwifiex_cmd_timeout_func
                                                  if_ops.card_reset(adapter)
                                                    mwifiex_sdio_card_reset
                                                      schedule_work(&card->work)
                                                        mwifiex_sdio_work
                                                          mwifiex_sdio_card_reset_work
                                                            mwifiex_reinit_sw
                                                              _mwifiex_fw_dpc
                                                                mwifiex_free_adapter
                                                                  mwifiex_unregister
                                                                    kfree(adapter)  //free adapter
                mwifiex_get_priv
                  // Use adapter

Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
---
 drivers/net/wireless/marvell/mwifiex/cmdevt.c | 2 --
 drivers/net/wireless/marvell/mwifiex/init.c   | 2 --
 2 files changed, 4 deletions(-)

Comments

Brian Norris Feb. 21, 2023, 9:59 p.m. UTC | #1
On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote:
> Note that, this bug is found by static analysis, it could be wrong. We
> could discuss that before writing the fix.

Yeah, please don't accept this patch. It deserves an "RFC" in the title
at best. Sure, it's an identified race condition, but the cure here
(deleting all possible recovery from firmware crashes) is worse than the
disease.

There's no real attempt at analyzing the race or providing solutions, so
there's not much to discuss yet.

Brian
Zheng Hacker Feb. 22, 2023, 4:17 a.m. UTC | #2
Brian Norris <briannorris@chromium.org> 于2023年2月22日周三 11:31写道:
>
> On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote:
> > Note that, this bug is found by static analysis, it could be wrong. We
> > could discuss that before writing the fix.
>
> Yeah, please don't accept this patch. It deserves an "RFC" in the title
> at best. Sure, it's an identified race condition, but the cure here
> (deleting all possible recovery from firmware crashes) is worse than the
> disease.
>

Hello Brain,

Thanks for your reply. I do need add "RFC" in the title. Sorry about that.
The invoking chain is as descriped in the original mail.

There is some place which may confuse you. In
mwifiex_exec_next_cmd function, the adapter is got from
cmd_node->priv->adapter. You might think if this is the same adapter
in outer function. In mwifiex_register function, it inits the
priv_arrary with
adapter->priv[i]->adapter = adapter,
Then use mwifiex_init_lock_list to init the list. When it fetch adapter like
calling mwifiex_cfg80211_get_tx_power, it travers the array to find the
target priv. So all the adapter paramter is pointed to the same adapter.
The UAF of it is vulnerable.

In summary, after the firmware is downloaded, it will start the timer function,
which is known as mwifiex_cmd_timeout_func. The if_ops.card_reset
function pointer is assigned with mwifiex_sdio_card_reset, which will
schedule_work the card->work. It finally pass the check becauese
card->work_flags has MWIFIEX_IFACE_WORK_CARD_RESET.

Finnaly, in _mwifiex_fw_dp, if init is failed, it will call mwifiex_free_adapter
and free the adapter.


> There's no real attempt at analyzing the race or providing solutions, so
> there's not much to discuss yet.

Yes, I did't figure out a good plan to fix. As I say, it free the adapter in the
error handling path. Could you please provide some advice about the fix?

Best regards,
Zheng
Brian Norris Feb. 22, 2023, 9:20 p.m. UTC | #3
On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote:
> Could you please provide some advice about the fix?

This entire driver's locking patterns (or lack
thereof) need rewritten. This driver was probably written by someone
that doesn't really understand concurrent programming. It really only
works because the bulk of normal operation is sequentialized into the
main loop (mwifiex_main_process()). Any time you get outside that,
you're likely to find bugs.

But now that I've looked a little further, I'm not confident you pointed
out a real bug. How does mwifiex_sdio_card_reset_work() get past
mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That
should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit
the race you point out.

Note to self: ignore most "static analysis" reports of race conditions,
unless they have thorough analysis or a runtime reproduction.

Brian
Zheng Hacker Feb. 24, 2023, 5:37 a.m. UTC | #4
Hello Brain,

Thanks for your detailed review. Sorry we missed something. As you say, We are
lacking some consideration, the pointed race path could not happen. But after
diving deep into the code, we found another path.

Here is the possible path. In summary, if the execution of CPU1 is
stuck by fw_done,
the cpu0 will continue executing and free the adapter gets into
error-path. The cpu1
did not notice that and UAF happened.

If there is something else we didn't know, please feel free to let us know.

        CPU0                                                        CPU1
mwifiex_sdio_probe
mwifiex_add_card
mwifiex_init_hw_fw
request_firmware_nowait
  mwifiex_fw_dpc
    _mwifiex_fw_dpc
      mwifiex_init_fw
        mwifiex_main_process
          mwifiex_exec_next_cmd
            mwifiex_dnld_cmd_to_fw
              mod_timer(&adapter->cmd_timer,..)
                                                        mwifiex_cmd_timeout_func

if_ops.card_reset(adapter)

mwifiex_sdio_card_reset
                                                            [*] stuck
in mwifiex_shutdown_sw
              return 0 in mwifiex_dnld_cmd_to_fw
                return 0 in mwifiex_exec_next_cmd
                  return 0 in  mwifiex_main_process
                    return -EINPROGRESS in mwifiex_init_fw
                      get into error path, mwifiex_free_adapter
                        // free adapter
                        complete_all(fw_done);
                                                              [*]Go on
                                                              Use
adapter->fw_done


Best regards,
Zheng


Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道:
>
> On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote:
> > Could you please provide some advice about the fix?
>
> This entire driver's locking patterns (or lack
> thereof) need rewritten. This driver was probably written by someone
> that doesn't really understand concurrent programming. It really only
> works because the bulk of normal operation is sequentialized into the
> main loop (mwifiex_main_process()). Any time you get outside that,
> you're likely to find bugs.
>
> But now that I've looked a little further, I'm not confident you pointed
> out a real bug. How does mwifiex_sdio_card_reset_work() get past
> mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That
> should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit
> the race you point out.
>
> Note to self: ignore most "static analysis" reports of race conditions,
> unless they have thorough analysis or a runtime reproduction.
>
> Brian
Zheng Hacker Feb. 24, 2023, 6:17 a.m. UTC | #5
This email is broken for the statement is too long, Here is the newest email.

Hello Brain,

Thanks for your detailed review. Sorry we missed something. As you say, We are
lacking some consideration, the pointed race path could not happen. But after
diving deep into the code, we found another path.

Here is the possible path. In summary, if the execution of CPU1 is
stuck by fw_done,
the cpu0 will continue executing and free the adapter gets into
error-path. The cpu1
did not notice that and UAF happened.

If there is something else we didn't know, please feel free to let us know.

        CPU0                                         CPU1
mwifiex_sdio_probe
mwifiex_add_card
mwifiex_init_hw_fw
request_firmware_nowait
  mwifiex_fw_dpc
    _mwifiex_fw_dpc
      mwifiex_init_fw
        mwifiex_main_process
          mwifiex_exec_next_cmd
            mwifiex_dnld_cmd_to_fw
              mod_timer(&adapter->cmd_timer,..)


                                              mwifiex_cmd_timeout_func
                                                if_ops.card_reset(adapter)
                                                  mwifiex_sdio_card_reset
                                                  [*] stuck in
mwifiex_shutdown_sw


              retn 0 in mwifiex_dnld_cmd_to_fw
              retn 0 in mwifiex_exec_next_cmd
              retn 0 in  mwifiex_main_process
              retn -EINPROGRESS in mwifiex_init_fw
              mwifiex_free_adapter when in error
              // free adapter
              complete_all(fw_done);


                                              [*]Go on
                                              Use adapter->fw_done

Best regards,
Zheng


Zheng Hacker <hackerzheng666@gmail.com> 于2023年2月24日周五 13:37写道:
>
> Hello Brain,
>
> Thanks for your detailed review. Sorry we missed something. As you say, We are
> lacking some consideration, the pointed race path could not happen. But after
> diving deep into the code, we found another path.
>
> Here is the possible path. In summary, if the execution of CPU1 is
> stuck by fw_done,
> the cpu0 will continue executing and free the adapter gets into
> error-path. The cpu1
> did not notice that and UAF happened.
>
> If there is something else we didn't know, please feel free to let us know.
>
>         CPU0                                                        CPU1
> mwifiex_sdio_probe
> mwifiex_add_card
> mwifiex_init_hw_fw
> request_firmware_nowait
>   mwifiex_fw_dpc
>     _mwifiex_fw_dpc
>       mwifiex_init_fw
>         mwifiex_main_process
>           mwifiex_exec_next_cmd
>             mwifiex_dnld_cmd_to_fw
>               mod_timer(&adapter->cmd_timer,..)
>                                                         mwifiex_cmd_timeout_func
>
> if_ops.card_reset(adapter)
>
> mwifiex_sdio_card_reset
>                                                             [*] stuck
> in mwifiex_shutdown_sw
>               return 0 in mwifiex_dnld_cmd_to_fw
>                 return 0 in mwifiex_exec_next_cmd
>                   return 0 in  mwifiex_main_process
>                     return -EINPROGRESS in mwifiex_init_fw
>                       get into error path, mwifiex_free_adapter
>                         // free adapter
>                         complete_all(fw_done);
>                                                               [*]Go on
>                                                               Use
> adapter->fw_done
>
>
> Best regards,
> Zheng
>
>
> Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道:
> >
> > On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote:
> > > Could you please provide some advice about the fix?
> >
> > This entire driver's locking patterns (or lack
> > thereof) need rewritten. This driver was probably written by someone
> > that doesn't really understand concurrent programming. It really only
> > works because the bulk of normal operation is sequentialized into the
> > main loop (mwifiex_main_process()). Any time you get outside that,
> > you're likely to find bugs.
> >
> > But now that I've looked a little further, I'm not confident you pointed
> > out a real bug. How does mwifiex_sdio_card_reset_work() get past
> > mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That
> > should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit
> > the race you point out.
> >
> > Note to self: ignore most "static analysis" reports of race conditions,
> > unless they have thorough analysis or a runtime reproduction.
> >
> > Brian
Brian Norris Feb. 24, 2023, 9:39 p.m. UTC | #6
On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote:
> This email is broken for the statement is too long, Here is the newest email.

It still wraps a bit weird, but it's good enough I suppose.

>               retn -EINPROGRESS in mwifiex_init_fw
>               mwifiex_free_adapter when in error

These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as
a true error; -EINPROGRESS is treated as success, such that we continue
to wait for the command response. Now, we might hang here if that
response doesn't come, but that's a different problem...

I'm sure there are true bugs in here somewhere, but I've spent enough
time reading your incorrect reports and don't plan to spend more. (If
you're lucky, maybe you can pique my curiosity again, but don't count on
it.)

Regards,
Brian
Zheng Hacker March 2, 2023, 9:48 a.m. UTC | #7
Brian Norris <briannorris@chromium.org> 于2023年2月25日周六 05:39写道:
>
> On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote:
> > This email is broken for the statement is too long, Here is the newest email.
>
> It still wraps a bit weird, but it's good enough I suppose.
>
> >               retn -EINPROGRESS in mwifiex_init_fw
> >               mwifiex_free_adapter when in error
>
> These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as
> a true error; -EINPROGRESS is treated as success, such that we continue
> to wait for the command response. Now, we might hang here if that
> response doesn't come, but that's a different problem...
>

Hi Brain,

Sorry for my unclear description. As you say, they don't have any connection.
What I really want to express is after mwifiex_init_fw return -EINPROGRESS
to its invoker, which is _mwifiex_fw_dpc. It will pass the check of
return value,
as the following code.
```cpp
ret = mwifiex_init_fw(adapter);
  if (ret == -1) {
    goto err_init_fw;
  } else if (!ret) {
    adapter->hw_status = MWIFIEX_HW_STATUS_READY;
    goto done;
  }
```

 it continues executing in _mwifiex_fw_dpc. Then in some unexpected situation,,
it'll get into error path like mwifiex_init_channel_scan_gap return non-zero
code if there is no more memory to use. It'll then get into err_init_chan_scan
label and call mwifiex_free_adapte finally.  The other thread may USE it
afterwards.

```cpp
if (mwifiex_init_channel_scan_gap(adapter)) {
    mwifiex_dbg(adapter, ERROR,
          "could not init channel stats table\n");
    goto err_init_chan_scan;
  }
```
> I'm sure there are true bugs in here somewhere, but I've spent enough
> time reading your incorrect reports and don't plan to spend more. (If
> you're lucky, maybe you can pique my curiosity again, but don't count on
> it.)
>

BUT after reviewing the code carefully, I found this might not happen
due to some  exclusive condition. So yes, I also think there is still some
problem here but I'm kind of busy nowadays. I promise to attach a clearer
report next time.

So sorry to bother you so many times. And appreciate for your precious
time spending on this report.

Best regards,
Zheng
diff mbox series

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/cmdevt.c b/drivers/net/wireless/marvell/mwifiex/cmdevt.c
index d3339d67e7a0..688dd451aba9 100644
--- a/drivers/net/wireless/marvell/mwifiex/cmdevt.c
+++ b/drivers/net/wireless/marvell/mwifiex/cmdevt.c
@@ -1016,8 +1016,6 @@  mwifiex_cmd_timeout_func(struct timer_list *t)
 	if (adapter->if_ops.device_dump)
 		adapter->if_ops.device_dump(adapter);
 
-	if (adapter->if_ops.card_reset)
-		adapter->if_ops.card_reset(adapter);
 }
 
 void
diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
index 7dddb4b5dea1..ff2d447c1de3 100644
--- a/drivers/net/wireless/marvell/mwifiex/init.c
+++ b/drivers/net/wireless/marvell/mwifiex/init.c
@@ -47,8 +47,6 @@  static void wakeup_timer_fn(struct timer_list *t)
 	adapter->hw_status = MWIFIEX_HW_STATUS_RESET;
 	mwifiex_cancel_all_pending_cmd(adapter);
 
-	if (adapter->if_ops.card_reset)
-		adapter->if_ops.card_reset(adapter);
 }
 
 static void fw_dump_work(struct work_struct *work)