diff mbox series

[net] sierra_net: Fix use-after-free on unbind

Message ID 80e88f61ca68c36ebce5d17dfcaa8e956e19fb2f.1655196227.git.lukas@wunner.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] sierra_net: Fix use-after-free on unbind | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: dcbw@redhat.com; 1 maintainers not CCed: dcbw@redhat.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Lukas Wunner June 14, 2022, 8:50 a.m. UTC
On unbind, the Sierra USB WWAN driver cancels the sierra_net_kevent()
work, then stops polling for interrupts by calling usbnet_status_stop().

However the interrupt handler sierra_net_status() may re-schedule the
work after it's been canceled and thus cause a use-after-free.

Fix by inverting the teardown order.

Fixes: 7b0c5f21f348 ("sierra_net: keep status interrupt URB active")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.10+
Cc: Dan Williams <dan.j.williams@intel.com>
---
 drivers/net/usb/sierra_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Oliver Neukum June 14, 2022, 10:48 a.m. UTC | #1
On 14.06.22 10:50, Lukas Wunner wrote:

> @@ -758,6 +758,8 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf)
>  
>  	dev_dbg(&dev->udev->dev, "%s", __func__);
>  
> +	usbnet_status_stop(dev);
> +
>  	/* kill the timer and work */
>  	del_timer_sync(&priv->sync_timer);
>  	cancel_work_sync(&priv->sierra_net_kevent);

Hi,

as far as I can see the following race condition exists:


CPU A:

intr_complete() -> static void sierra_net_status() -> defer_kevent()

									CPU B:

usbnet_stop_status()  ---- kills the URB but only the URB, kevent scheduled

CPU A:

sierra_net_kevent -> sierra_net_dosync() ->

CPU B:
-> del_timer_sync(&priv->sync_timer);  ---- NOP, too early

CPU A:

add_timer(&priv->sync_timer);

CPU B:

cancel_work_sync(&priv->sierra_net_kevent);  ---- NOP, too late

	Regards
		Oliver
Lukas Wunner June 21, 2022, 4:43 p.m. UTC | #2
[adding Jann as UAF connoisseur to cc]

On Tue, Jun 14, 2022 at 12:48:23PM +0200, Oliver Neukum wrote:
> On 14.06.22 10:50, Lukas Wunner wrote:
> > @@ -758,6 +758,8 @@ static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf)
> >  
> >  	dev_dbg(&dev->udev->dev, "%s", __func__);
> >  
> > +	usbnet_status_stop(dev);
> > +
> >  	/* kill the timer and work */
> >  	del_timer_sync(&priv->sync_timer);
> >  	cancel_work_sync(&priv->sierra_net_kevent);
> 
> as far as I can see the following race condition exists:
> 
> CPU A:
> intr_complete() -> static void sierra_net_status() -> defer_kevent()
> 
> CPU B:
> usbnet_stop_status()  ---- kills the URB but only the URB, kevent scheduled
> 
> CPU A:
> sierra_net_kevent -> sierra_net_dosync() ->
> 
> CPU B:
> -> del_timer_sync(&priv->sync_timer);  ---- NOP, too early
> 
> CPU A:
> add_timer(&priv->sync_timer);
> 
> CPU B:
> cancel_work_sync(&priv->sierra_net_kevent);  ---- NOP, too late

I see your point, but what's the solution?

I could call netif_device_detach() on ->disconnect(), then avoid
scheduling sierra_net_kevent in the timer if !netif_device_present(),
and also avoid arming the timer in sierra_net_kevent under the same
condition.

Still, I think I'd need 3 calls to make this bulletproof, either

	del_timer_sync(&priv->sync_timer);
	cancel_work_sync(&priv->sierra_net_kevent);
	del_timer_sync(&priv->sync_timer);

or

	cancel_work_sync(&priv->sierra_net_kevent);
	del_timer_sync(&priv->sync_timer);
	cancel_work_sync(&priv->sierra_net_kevent);

Doesn't really matter which of these two.  Am I right?
Is there a better (simpler) approach?

FWIW, the logic in usbnet.c looks similarly flawed:
There's a timer, a tasklet and a work.
(Sounds like one of those "... walk into a bar" jokes.)

The timer is armed by the tx/rx URB completion callbacks.
Those URBs are terminated in usbnet_stop() and the timer is
deleted.  So far so good.  But:

The tasklet schedules the work.
The work schedules the tasklet.
The tasklet also schedules itself.

We kill the tasklet in usbnet_stop() and afterwards cancel the
work in usbnet_disconnect().  What happens if the work schedules
the tasklet again?  Another UAF.  That may happen in the EVENT_RX_HALT,
EVENT_RX_MEMORY, EVENT_LINK_RESET and EVENT_LINK_CHANGE code paths.
A few netif_device_present() safeguards may help to prevent
rescheduling the killed tasklet, but I suspect we may again need
3 calls here (tasklet_kill() / cancel_work_sync() / tasklet_kill())
to make it bulletproof.  What do you think?

As a heads-up, I'm going to move the cancel_work_sync() to usbnet_stop()
in an upcoming patch.  That seems to be Jakub's preferred approach to
tackle the linkwatch UAF.  I fear it may increase the risk of encountering
the issues outlined above as the time between tasklet_kill() and
cancel_work_sync() is reduced:

https://github.com/l1k/linux/commit/89988b499ab9

Thanks,

Lukas
Oliver Neukum June 22, 2022, 7:42 a.m. UTC | #3
On 21.06.22 18:43, Lukas Wunner wrote:
> [adding Jann as UAF connoisseur to cc]
> 
> On Tue, Jun 14, 2022 at 12:48:23PM +0200, Oliver Neukum wrote:

>>
>> as far as I can see the following race condition exists:
>>
>> CPU A:
>> intr_complete() -> static void sierra_net_status() -> defer_kevent()
>>
>> CPU B:
>> usbnet_stop_status()  ---- kills the URB but only the URB, kevent scheduled
>>
>> CPU A:
>> sierra_net_kevent -> sierra_net_dosync() ->
>>
>> CPU B:
>> -> del_timer_sync(&priv->sync_timer);  ---- NOP, too early
>>
>> CPU A:
>> add_timer(&priv->sync_timer);
>>
>> CPU B:
>> cancel_work_sync(&priv->sierra_net_kevent);  ---- NOP, too late
> 
> I see your point, but what's the solution?

That is hard to say. I will go as far as saying that this will need
a flag indicating a status of currently being disconnected.

> I could call netif_device_detach() on ->disconnect(), then avoid
> scheduling sierra_net_kevent in the timer if !netif_device_present(),
> and also avoid arming the timer in sierra_net_kevent under the same
> condition.

A flag you get by using netif_device_present() as a flag.

Yet the idea of shifting things around in the disconnect() code
path of a USB network driver just to solve some other issue makes
me very nervous.
If you decide that this needs a flag, please add a dedicated flag.

> 
> Still, I think I'd need 3 calls to make this bulletproof, either
> 
> 	del_timer_sync(&priv->sync_timer);
> 	cancel_work_sync(&priv->sierra_net_kevent);
> 	del_timer_sync(&priv->sync_timer);
> 
> or
> 
> 	cancel_work_sync(&priv->sierra_net_kevent);
> 	del_timer_sync(&priv->sync_timer);
> 	cancel_work_sync(&priv->sierra_net_kevent);
> 
> Doesn't really matter which of these two.  Am I right?

Yes, that is right.

> Is there a better (simpler) approach?

I am thinking about causing the timer and the kevent fail
their URB submissions.

> FWIW, the logic in usbnet.c looks similarly flawed:
> There's a timer, a tasklet and a work.
> (Sounds like one of those "... walk into a bar" jokes.)

We need somebody to start a web comic based on that.

> The timer is armed by the tx/rx URB completion callbacks.
> Those URBs are terminated in usbnet_stop() and the timer is
> deleted.  So far so good.  But:
> 
> The tasklet schedules the work.
> The work schedules the tasklet.
> The tasklet also schedules itself.

This test is supposed to make the kevent save from itself:

        if (netif_running (dev->net) &&
            netif_device_present (dev->net) &&

Do you think it is insufficient?

I must admit the logic in usbnet is not easy to understand.
In fact it may not be possible to understand at all.

> We kill the tasklet in usbnet_stop() and afterwards cancel the
> work in usbnet_disconnect().  What happens if the work schedules
> the tasklet again?  Another UAF.  That may happen in the EVENT_RX_HALT,
> EVENT_RX_MEMORY, EVENT_LINK_RESET and EVENT_LINK_CHANGE code paths.
> A few netif_device_present() safeguards may help to prevent
> rescheduling the killed tasklet, but I suspect we may again need
> 3 calls here (tasklet_kill() / cancel_work_sync() / tasklet_kill())
> to make it bulletproof.  What do you think?

I think that this is unsalvagaeble. We'd fare better with a clear
"zombie" flag we test before firing off anything.

> As a heads-up, I'm going to move the cancel_work_sync() to usbnet_stop()
> in an upcoming patch.  That seems to be Jakub's preferred approach to
> tackle the linkwatch UAF.  I fear it may increase the risk of encountering
> the issues outlined above as the time between tasklet_kill() and
> cancel_work_sync() is reduced:
> 
> https://github.com/l1k/linux/commit/89988b499ab9

Please do go ahead to adapt it to the way the big network drivers need it.
You have now convinced me that usbnet needs major surgery. This means
work in merging for me in any case. Feel free to do what serves the
users best. Usbnet is a  framework. It should be formed by what drivers
need, not the other way round.

	Regards
		Oliver
diff mbox series

Patch

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index bb4cbe8fc846..197e1356ae98 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -758,6 +758,8 @@  static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf)
 
 	dev_dbg(&dev->udev->dev, "%s", __func__);
 
+	usbnet_status_stop(dev);
+
 	/* kill the timer and work */
 	del_timer_sync(&priv->sync_timer);
 	cancel_work_sync(&priv->sierra_net_kevent);
@@ -769,8 +771,6 @@  static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf)
 		netdev_err(dev->net,
 			"usb_control_msg failed, status %d\n", status);
 
-	usbnet_status_stop(dev);
-
 	sierra_net_set_private(dev, NULL);
 	kfree(priv);
 }