diff mbox series

[net] ax25: fix use-after-free bugs caused by ax25_ds_del_timer

Message ID 20240326142542.118058-1-duoming@zju.edu.cn (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] ax25: fix use-after-free bugs caused by ax25_ds_del_timer | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 945 this patch: 945
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 955 this patch: 955
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 956 this patch: 956
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 13 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-03-27--15-00 (tests: 952)

Commit Message

Duoming Zhou March 26, 2024, 2:25 p.m. UTC
When the ax25 device is detaching, the ax25_dev_device_down()
calls ax25_ds_del_timer() to cleanup the slave_timer. When
the timer handler is running, the ax25_ds_del_timer() that
calls del_timer() in it will return directly. As a result,
the use-after-free bugs could happen, one of the scenarios
is shown below:

      (Thread 1)          |      (Thread 2)
                          | ax25_ds_timeout()
ax25_dev_device_down()    |
  ax25_ds_del_timer()     |
    del_timer()           |
  ax25_dev_put() //FREE   |
                          |  ax25_dev-> //USE

In order to mitigate bugs, when the device is detaching, use
timer_shutdown_sync() to stop the timer.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
 net/ax25/ax25_ds_timer.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Simon Horman March 27, 2024, 7:10 p.m. UTC | #1
On Tue, Mar 26, 2024 at 10:25:42PM +0800, Duoming Zhou wrote:
> When the ax25 device is detaching, the ax25_dev_device_down()
> calls ax25_ds_del_timer() to cleanup the slave_timer. When
> the timer handler is running, the ax25_ds_del_timer() that
> calls del_timer() in it will return directly. As a result,
> the use-after-free bugs could happen, one of the scenarios
> is shown below:
> 
>       (Thread 1)          |      (Thread 2)
>                           | ax25_ds_timeout()
> ax25_dev_device_down()    |
>   ax25_ds_del_timer()     |
>     del_timer()           |
>   ax25_dev_put() //FREE   |
>                           |  ax25_dev-> //USE
> 
> In order to mitigate bugs, when the device is detaching, use
> timer_shutdown_sync() to stop the timer.

FWIIW, in my reading of things there is another failure mode whereby
ax25_ds_timeout may rearm the timer after the call to del_timer() but
before the call to ax25_dev_put().

> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
>  net/ax25/ax25_ds_timer.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> index c4f8adbf814..5624c0d174c 100644
> --- a/net/ax25/ax25_ds_timer.c
> +++ b/net/ax25/ax25_ds_timer.c
> @@ -43,7 +43,12 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
>  
>  void ax25_ds_del_timer(ax25_dev *ax25_dev)
>  {
> -	if (ax25_dev)
> +	if (!ax25_dev)
> +		return;
> +
> +	if (!ax25_dev->device_up)
> +		timer_shutdown_sync(&ax25_dev->dama.slave_timer);
> +	else
>  		del_timer(&ax25_dev->dama.slave_timer);
>  }

I think that a) it is always correct to call timer_shutdown_sync,
and b) ax25_dev->device_up is always true. So a call to
timer_shutdown_sync can simply replace the call to del_timer.

Also, not strictly related, I think ax25_dev cannot be NULL,
so that check could be dropped. But perhaps that is better left alone.


Zooming out a bit, has removal of ax25 been considered.
I didn't check the logs thoroughly, but I'm not convinced it's been
maintained - other than clean-ups and by-inspection bug fixes - since git
history began.
Duoming Zhou March 28, 2024, 5:34 a.m. UTC | #2
On Wed, 27 Mar 2024 19:10:25 +0000 Simon Horman wrote:
> > When the ax25 device is detaching, the ax25_dev_device_down()
> > calls ax25_ds_del_timer() to cleanup the slave_timer. When
> > the timer handler is running, the ax25_ds_del_timer() that
> > calls del_timer() in it will return directly. As a result,
> > the use-after-free bugs could happen, one of the scenarios
> > is shown below:
> > 
> >       (Thread 1)          |      (Thread 2)
> >                           | ax25_ds_timeout()
> > ax25_dev_device_down()    |
> >   ax25_ds_del_timer()     |
> >     del_timer()           |
> >   ax25_dev_put() //FREE   |
> >                           |  ax25_dev-> //USE
> > 
> > In order to mitigate bugs, when the device is detaching, use
> > timer_shutdown_sync() to stop the timer.
> 
> FWIIW, in my reading of things there is another failure mode whereby
> ax25_ds_timeout may rearm the timer after the call to del_timer() but
> before the call to ax25_dev_put().

I think using timer_shutdown_sync() or del_timer_sync() to replace del_timer()
could prevent the rearm.

> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> >  net/ax25/ax25_ds_timer.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > index c4f8adbf814..5624c0d174c 100644
> > --- a/net/ax25/ax25_ds_timer.c
> > +++ b/net/ax25/ax25_ds_timer.c
> > @@ -43,7 +43,12 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> >  
> >  void ax25_ds_del_timer(ax25_dev *ax25_dev)
> >  {
> > -	if (ax25_dev)
> > +	if (!ax25_dev)
> > +		return;
> > +
> > +	if (!ax25_dev->device_up)
> > +		timer_shutdown_sync(&ax25_dev->dama.slave_timer);
> > +	else
> >  		del_timer(&ax25_dev->dama.slave_timer);
> >  }
> 
> I think that a) it is always correct to call timer_shutdown_sync,
> and b) ax25_dev->device_up is always true. So a call to
> timer_shutdown_sync can simply replace the call to del_timer.

I think timer_shutdown*() is used for the code path to clean up the
driver or detach the device. If timer is shut down by timer_shutdown*(),
it could not be re-armed again unless we reinitialize the timer. The
slave_timer should only be shut down when the ax25 device is detaching or
the driver is removing. And it should not be shut down in other scenarios,
such as called in ax25_ds_state2_machine() or ax25_ds_state3_machine().
So I think calling timer_shutdown_sync() is not always correct.

What's more, the ax25_dev->device_up is not always true. It is set to
false in ax25_kill_by_device().

In a word, the timer_shutdown_sync() could not replace the del_timer()
completely.

> Also, not strictly related, I think ax25_dev cannot be NULL,
> so that check could be dropped. But perhaps that is better left alone.

The ax25_dev cannot not be NULL, because we only use ax25_dev_put() to
free the ax25_dev instead of setting is to NULL. So I think the check
could be dropped.

Do you think the following plan is proper?

diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
index c4f8adbf8144..f1cab4effa44 100644
--- a/net/ax25/ax25_ds_timer.c
+++ b/net/ax25/ax25_ds_timer.c
@@ -43,8 +43,7 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)

 void ax25_ds_del_timer(ax25_dev *ax25_dev)
 {
-       if (ax25_dev)
-               del_timer(&ax25_dev->dama.slave_timer);
+       del_timer_sync(&ax25_dev->dama.slave_timer);
 }

There is no deadlock will happen.

> Zooming out a bit, has removal of ax25 been considered.
> I didn't check the logs thoroughly, but I'm not convinced it's been
> maintained - other than clean-ups and by-inspection bug fixes - since git
> history began.

Best regards,
Duoming Zhou
Simon Horman March 28, 2024, 6:12 p.m. UTC | #3
On Thu, Mar 28, 2024 at 01:34:48PM +0800, duoming@zju.edu.cn wrote:
> On Wed, 27 Mar 2024 19:10:25 +0000 Simon Horman wrote:
> > > When the ax25 device is detaching, the ax25_dev_device_down()
> > > calls ax25_ds_del_timer() to cleanup the slave_timer. When
> > > the timer handler is running, the ax25_ds_del_timer() that
> > > calls del_timer() in it will return directly. As a result,
> > > the use-after-free bugs could happen, one of the scenarios
> > > is shown below:
> > > 
> > >       (Thread 1)          |      (Thread 2)
> > >                           | ax25_ds_timeout()
> > > ax25_dev_device_down()    |
> > >   ax25_ds_del_timer()     |
> > >     del_timer()           |
> > >   ax25_dev_put() //FREE   |
> > >                           |  ax25_dev-> //USE
> > > 
> > > In order to mitigate bugs, when the device is detaching, use
> > > timer_shutdown_sync() to stop the timer.
> > 
> > FWIIW, in my reading of things there is another failure mode whereby
> > ax25_ds_timeout may rearm the timer after the call to del_timer() but
> > before the call to ax25_dev_put().
> 
> I think using timer_shutdown_sync() or del_timer_sync() to replace del_timer()
> could prevent the rearm.

I think only timer_shutdown() and timer_shutdown_sync() will prevent a
rearm. But I also think (but am not entirely sure) this is only important
in the ax25_dev_device_down() case (there are others, as you mention
below).

> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > > ---
> > >  net/ax25/ax25_ds_timer.c | 7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > > index c4f8adbf814..5624c0d174c 100644
> > > --- a/net/ax25/ax25_ds_timer.c
> > > +++ b/net/ax25/ax25_ds_timer.c
> > > @@ -43,7 +43,12 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> > >  
> > >  void ax25_ds_del_timer(ax25_dev *ax25_dev)
> > >  {
> > > -	if (ax25_dev)
> > > +	if (!ax25_dev)
> > > +		return;
> > > +
> > > +	if (!ax25_dev->device_up)
> > > +		timer_shutdown_sync(&ax25_dev->dama.slave_timer);
> > > +	else
> > >  		del_timer(&ax25_dev->dama.slave_timer);
> > >  }
> > 
> > I think that a) it is always correct to call timer_shutdown_sync,
> > and b) ax25_dev->device_up is always true. So a call to
> > timer_shutdown_sync can simply replace the call to del_timer.
> 
> I think timer_shutdown*() is used for the code path to clean up the
> driver or detach the device. If timer is shut down by timer_shutdown*(),
> it could not be re-armed again unless we reinitialize the timer. The
> slave_timer should only be shut down when the ax25 device is detaching or
> the driver is removing. And it should not be shut down in other scenarios,
> such as called in ax25_ds_state2_machine() or ax25_ds_state3_machine().
> So I think calling timer_shutdown_sync() is not always correct.
> 
> What's more, the ax25_dev->device_up is not always true. It is set to
> false in ax25_kill_by_device().
> 
> In a word, the timer_shutdown_sync() could not replace the del_timer()
> completely.

Yes, sorry. I missed that ax25_ds_del_timer() is not
only called from ax25_dev_device_down().

> > Also, not strictly related, I think ax25_dev cannot be NULL,
> > so that check could be dropped. But perhaps that is better left alone.
> 
> The ax25_dev cannot not be NULL, because we only use ax25_dev_put() to
> free the ax25_dev instead of setting is to NULL. So I think the check
> could be dropped.
> 
> Do you think the following plan is proper?
> 
> diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> index c4f8adbf8144..f1cab4effa44 100644
> --- a/net/ax25/ax25_ds_timer.c
> +++ b/net/ax25/ax25_ds_timer.c
> @@ -43,8 +43,7 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> 
>  void ax25_ds_del_timer(ax25_dev *ax25_dev)
>  {
> -       if (ax25_dev)
> -               del_timer(&ax25_dev->dama.slave_timer);
> +       del_timer_sync(&ax25_dev->dama.slave_timer);
>  }
> 
> There is no deadlock will happen.

I'm actually getting to think that your original patch was correct.
But perhaps a different approach would be to simply call
timer_shutdown_sync() in ax25_dev_device_down(). And leave
ax25_ds_del_timer() alone.

> 
> > Zooming out a bit, has removal of ax25 been considered.
> > I didn't check the logs thoroughly, but I'm not convinced it's been
> > maintained - other than clean-ups and by-inspection bug fixes - since git
> > history began.
> 
> Best regards,
> Duoming Zhou
Duoming Zhou March 29, 2024, 1:54 a.m. UTC | #4
On Thu, 28 Mar 2024 18:12:50 +0000 Simon Horman wrote:
> > > > When the ax25 device is detaching, the ax25_dev_device_down()
> > > > calls ax25_ds_del_timer() to cleanup the slave_timer. When
> > > > the timer handler is running, the ax25_ds_del_timer() that
> > > > calls del_timer() in it will return directly. As a result,
> > > > the use-after-free bugs could happen, one of the scenarios
> > > > is shown below:
> > > > 
> > > >       (Thread 1)          |      (Thread 2)
> > > >                           | ax25_ds_timeout()
> > > > ax25_dev_device_down()    |
> > > >   ax25_ds_del_timer()     |
> > > >     del_timer()           |
> > > >   ax25_dev_put() //FREE   |
> > > >                           |  ax25_dev-> //USE
> > > > 
> > > > In order to mitigate bugs, when the device is detaching, use
> > > > timer_shutdown_sync() to stop the timer.
> > > 
> > > FWIIW, in my reading of things there is another failure mode whereby
> > > ax25_ds_timeout may rearm the timer after the call to del_timer() but
> > > before the call to ax25_dev_put().
> > 
> > I think using timer_shutdown_sync() or del_timer_sync() to replace del_timer()
> > could prevent the rearm.
> 
> I think only timer_shutdown() and timer_shutdown_sync() will prevent a
> rearm. But I also think (but am not entirely sure) this is only important
> in the ax25_dev_device_down() case (there are others, as you mention
> below).

When timer is rearmed in it's handler, the del_timer_sync() could prevent the
rearming. But when timer is rearmed in other threads, the del_timer_sync() could
not prevent it. The following code is apart of the del_timer_sync().

	do {
		ret = __try_to_del_timer_sync(timer, shutdown);

		if (unlikely(ret < 0)) {
			del_timer_wait_running(timer);
			cpu_relax();
		}
	} while (ret < 0);

In the ax25_dev_device_down() case, I think it is better to use 
timer_shutdown_sync().

> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > > > ---
> > > >  net/ax25/ax25_ds_timer.c | 7 ++++++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > > > index c4f8adbf814..5624c0d174c 100644
> > > > --- a/net/ax25/ax25_ds_timer.c
> > > > +++ b/net/ax25/ax25_ds_timer.c
> > > > @@ -43,7 +43,12 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> > > >  
> > > >  void ax25_ds_del_timer(ax25_dev *ax25_dev)
> > > >  {
> > > > -	if (ax25_dev)
> > > > +	if (!ax25_dev)
> > > > +		return;
> > > > +
> > > > +	if (!ax25_dev->device_up)
> > > > +		timer_shutdown_sync(&ax25_dev->dama.slave_timer);
> > > > +	else
> > > >  		del_timer(&ax25_dev->dama.slave_timer);
> > > >  }
> > > 
> > > I think that a) it is always correct to call timer_shutdown_sync,
> > > and b) ax25_dev->device_up is always true. So a call to
> > > timer_shutdown_sync can simply replace the call to del_timer.
> > 
> > I think timer_shutdown*() is used for the code path to clean up the
> > driver or detach the device. If timer is shut down by timer_shutdown*(),
> > it could not be re-armed again unless we reinitialize the timer. The
> > slave_timer should only be shut down when the ax25 device is detaching or
> > the driver is removing. And it should not be shut down in other scenarios,
> > such as called in ax25_ds_state2_machine() or ax25_ds_state3_machine().
> > So I think calling timer_shutdown_sync() is not always correct.
> > 
> > What's more, the ax25_dev->device_up is not always true. It is set to
> > false in ax25_kill_by_device().
> > 
> > In a word, the timer_shutdown_sync() could not replace the del_timer()
> > completely.
> 
> Yes, sorry. I missed that ax25_ds_del_timer() is not
> only called from ax25_dev_device_down().
> 
> > > Also, not strictly related, I think ax25_dev cannot be NULL,
> > > so that check could be dropped. But perhaps that is better left alone.
> > 
> > The ax25_dev cannot not be NULL, because we only use ax25_dev_put() to
> > free the ax25_dev instead of setting is to NULL. So I think the check
> > could be dropped.
> > 
> > Do you think the following plan is proper?
> > 
> > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > index c4f8adbf8144..f1cab4effa44 100644
> > --- a/net/ax25/ax25_ds_timer.c
> > +++ b/net/ax25/ax25_ds_timer.c
> > @@ -43,8 +43,7 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> > 
> >  void ax25_ds_del_timer(ax25_dev *ax25_dev)
> >  {
> > -       if (ax25_dev)
> > -               del_timer(&ax25_dev->dama.slave_timer);
> > +       del_timer_sync(&ax25_dev->dama.slave_timer);
> >  }
> > 
> > There is no deadlock will happen.
> 
> I'm actually getting to think that your original patch was correct.
> But perhaps a different approach would be to simply call
> timer_shutdown_sync() in ax25_dev_device_down(). And leave
> ax25_ds_del_timer() alone.

I think using timer_shutdown_sync() in ax25_dev_device_down() and
leaving ax25_ds_del_timer() alone is better than the original patch.

Thank you for your suggestions!

Best regards,
Duoming Zhou
diff mbox series

Patch

diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
index c4f8adbf814..5624c0d174c 100644
--- a/net/ax25/ax25_ds_timer.c
+++ b/net/ax25/ax25_ds_timer.c
@@ -43,7 +43,12 @@  void ax25_ds_setup_timer(ax25_dev *ax25_dev)
 
 void ax25_ds_del_timer(ax25_dev *ax25_dev)
 {
-	if (ax25_dev)
+	if (!ax25_dev)
+		return;
+
+	if (!ax25_dev->device_up)
+		timer_shutdown_sync(&ax25_dev->dama.slave_timer);
+	else
 		del_timer(&ax25_dev->dama.slave_timer);
 }