Message ID | 20220630143842.24906-1-duoming@zju.edu.cn (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net: rose: fix UAF bug caused by rose_t0timer_expiry | expand |
On Thu, Jun 30, 2022 at 4:38 PM Duoming Zhou <duoming@zju.edu.cn> wrote: > > There are UAF bugs caused by rose_t0timer_expiry(). The > root cause is that del_timer() could not stop the timer > handler that is running and there is no synchronization. > One of the race conditions is shown below: > > (thread 1) | (thread 2) > | rose_device_event > | rose_rt_device_down > | rose_remove_neigh > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > ... | del_timer(&neigh->t0timer) > | kfree(rose_neigh) //[1]FREE > neigh->dce_mode //[2]USE | > > The rose_neigh is deallocated in position [1] and use in > position [2]. > > The crash trace triggered by POC is like below: > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > ... > Call Trace: > <IRQ> > dump_stack_lvl+0xbf/0xee > print_address_description+0x7b/0x440 > print_report+0x101/0x230 > ? expire_timers+0x144/0x320 > kasan_report+0xed/0x120 > ? expire_timers+0x144/0x320 > expire_timers+0x144/0x320 > __run_timers+0x3ff/0x4d0 > run_timer_softirq+0x41/0x80 > __do_softirq+0x233/0x544 > ... > > This patch changes del_timer() in rose_stop_t0timer() and > rose_stop_ftimer() to del_timer_sync() in order that the > timer handler could be finished before the resources such as > rose_neigh and so on are deallocated. As a result, the UAF > bugs could be mitigated. > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > net/rose/rose_link.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > index 8b96a56d3a4..9734d1264de 100644 > --- a/net/rose/rose_link.c > +++ b/net/rose/rose_link.c > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > void rose_stop_ftimer(struct rose_neigh *neigh) > { > - del_timer(&neigh->ftimer); > + del_timer_sync(&neigh->ftimer); > } Are you sure this is safe ? del_timer_sync() could hang if the caller holds a lock that the timer function would need to acquire. > > void rose_stop_t0timer(struct rose_neigh *neigh) > { > - del_timer(&neigh->t0timer); > + del_timer_sync(&neigh->t0timer); > } Same here, please explain why it is safe. > > int rose_ftimer_running(struct rose_neigh *neigh) > -- > 2.17.1 >
Hello, On Thu, 30 Jun 2022 16:44:29 +0200 Eric Dumazet wrote: > > There are UAF bugs caused by rose_t0timer_expiry(). The > > root cause is that del_timer() could not stop the timer > > handler that is running and there is no synchronization. > > One of the race conditions is shown below: > > > > (thread 1) | (thread 2) > > | rose_device_event > > | rose_rt_device_down > > | rose_remove_neigh > > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > > ... | del_timer(&neigh->t0timer) > > | kfree(rose_neigh) //[1]FREE > > neigh->dce_mode //[2]USE | > > > > The rose_neigh is deallocated in position [1] and use in > > position [2]. > > > > The crash trace triggered by POC is like below: > > > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > > ... > > Call Trace: > > <IRQ> > > dump_stack_lvl+0xbf/0xee > > print_address_description+0x7b/0x440 > > print_report+0x101/0x230 > > ? expire_timers+0x144/0x320 > > kasan_report+0xed/0x120 > > ? expire_timers+0x144/0x320 > > expire_timers+0x144/0x320 > > __run_timers+0x3ff/0x4d0 > > run_timer_softirq+0x41/0x80 > > __do_softirq+0x233/0x544 > > ... > > > > This patch changes del_timer() in rose_stop_t0timer() and > > rose_stop_ftimer() to del_timer_sync() in order that the > > timer handler could be finished before the resources such as > > rose_neigh and so on are deallocated. As a result, the UAF > > bugs could be mitigated. > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > --- > > net/rose/rose_link.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > > index 8b96a56d3a4..9734d1264de 100644 > > --- a/net/rose/rose_link.c > > +++ b/net/rose/rose_link.c > > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > > > void rose_stop_ftimer(struct rose_neigh *neigh) > > { > > - del_timer(&neigh->ftimer); > > + del_timer_sync(&neigh->ftimer); > > } > > Are you sure this is safe ? > > del_timer_sync() could hang if the caller holds a lock that the timer > function would need to acquire. I think this is safe. The rose_ftimer_expiry() is an empty function that is shown below: static void rose_ftimer_expiry(struct timer_list *t) { } > > > > void rose_stop_t0timer(struct rose_neigh *neigh) > > { > > - del_timer(&neigh->t0timer); > > + del_timer_sync(&neigh->t0timer); > > } > > Same here, please explain why it is safe. The rose_stop_t0timer() may hold "rose_node_list_lock" and "rose_neigh_list_lock", but the timer handler rose_t0timer_expiry() that is shown below does not need these two locks. static void rose_t0timer_expiry(struct timer_list *t) { struct rose_neigh *neigh = from_timer(neigh, t, t0timer); rose_transmit_restart_request(neigh); neigh->dce_mode = 0; rose_start_t0timer(neigh); } Best regards, Duoming Zhou
On Thu, Jun 30, 2022 at 5:08 PM <duoming@zju.edu.cn> wrote: > > Hello, > > On Thu, 30 Jun 2022 16:44:29 +0200 Eric Dumazet wrote: > > > > There are UAF bugs caused by rose_t0timer_expiry(). The > > > root cause is that del_timer() could not stop the timer > > > handler that is running and there is no synchronization. > > > One of the race conditions is shown below: > > > > > > (thread 1) | (thread 2) > > > | rose_device_event > > > | rose_rt_device_down > > > | rose_remove_neigh > > > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > > > ... | del_timer(&neigh->t0timer) > > > | kfree(rose_neigh) //[1]FREE > > > neigh->dce_mode //[2]USE | > > > > > > The rose_neigh is deallocated in position [1] and use in > > > position [2]. > > > > > > The crash trace triggered by POC is like below: > > > > > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > > > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > > > ... > > > Call Trace: > > > <IRQ> > > > dump_stack_lvl+0xbf/0xee > > > print_address_description+0x7b/0x440 > > > print_report+0x101/0x230 > > > ? expire_timers+0x144/0x320 > > > kasan_report+0xed/0x120 > > > ? expire_timers+0x144/0x320 > > > expire_timers+0x144/0x320 > > > __run_timers+0x3ff/0x4d0 > > > run_timer_softirq+0x41/0x80 > > > __do_softirq+0x233/0x544 > > > ... > > > > > > This patch changes del_timer() in rose_stop_t0timer() and > > > rose_stop_ftimer() to del_timer_sync() in order that the > > > timer handler could be finished before the resources such as > > > rose_neigh and so on are deallocated. As a result, the UAF > > > bugs could be mitigated. > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > > --- > > > net/rose/rose_link.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > > > index 8b96a56d3a4..9734d1264de 100644 > > > --- a/net/rose/rose_link.c > > > +++ b/net/rose/rose_link.c > > > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > > > > > void rose_stop_ftimer(struct rose_neigh *neigh) > > > { > > > - del_timer(&neigh->ftimer); > > > + del_timer_sync(&neigh->ftimer); > > > } > > > > Are you sure this is safe ? > > > > del_timer_sync() could hang if the caller holds a lock that the timer > > function would need to acquire. > > I think this is safe. The rose_ftimer_expiry() is an empty function that is > shown below: > > static void rose_ftimer_expiry(struct timer_list *t) > { > } > > > > > > > void rose_stop_t0timer(struct rose_neigh *neigh) > > > { > > > - del_timer(&neigh->t0timer); > > > + del_timer_sync(&neigh->t0timer); > > > } > > > > Same here, please explain why it is safe. > > The rose_stop_t0timer() may hold "rose_node_list_lock" and "rose_neigh_list_lock", > but the timer handler rose_t0timer_expiry() that is shown below does not need > these two locks. > > static void rose_t0timer_expiry(struct timer_list *t) > { > struct rose_neigh *neigh = from_timer(neigh, t, t0timer); > > rose_transmit_restart_request(neigh); > > neigh->dce_mode = 0; > > rose_start_t0timer(neigh); This will rearm the timer. del_timer_sync() will not help. Please read the comment in front of del_timer_sync(), in kernel/time/timer.c > } > > Best regards, > Duoming Zhou
Hello, On Thu, 30 Jun 2022 17:17:10 +0200 Eric Dumazet wrote: > > > > There are UAF bugs caused by rose_t0timer_expiry(). The > > > > root cause is that del_timer() could not stop the timer > > > > handler that is running and there is no synchronization. > > > > One of the race conditions is shown below: > > > > > > > > (thread 1) | (thread 2) > > > > | rose_device_event > > > > | rose_rt_device_down > > > > | rose_remove_neigh > > > > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > > > > ... | del_timer(&neigh->t0timer) > > > > | kfree(rose_neigh) //[1]FREE > > > > neigh->dce_mode //[2]USE | > > > > > > > > The rose_neigh is deallocated in position [1] and use in > > > > position [2]. > > > > > > > > The crash trace triggered by POC is like below: > > > > > > > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > > > > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > > > > ... > > > > Call Trace: > > > > <IRQ> > > > > dump_stack_lvl+0xbf/0xee > > > > print_address_description+0x7b/0x440 > > > > print_report+0x101/0x230 > > > > ? expire_timers+0x144/0x320 > > > > kasan_report+0xed/0x120 > > > > ? expire_timers+0x144/0x320 > > > > expire_timers+0x144/0x320 > > > > __run_timers+0x3ff/0x4d0 > > > > run_timer_softirq+0x41/0x80 > > > > __do_softirq+0x233/0x544 > > > > ... > > > > > > > > This patch changes del_timer() in rose_stop_t0timer() and > > > > rose_stop_ftimer() to del_timer_sync() in order that the > > > > timer handler could be finished before the resources such as > > > > rose_neigh and so on are deallocated. As a result, the UAF > > > > bugs could be mitigated. > > > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > > > --- > > > > net/rose/rose_link.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > > > > index 8b96a56d3a4..9734d1264de 100644 > > > > --- a/net/rose/rose_link.c > > > > +++ b/net/rose/rose_link.c > > > > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > > > > > > > void rose_stop_ftimer(struct rose_neigh *neigh) > > > > { > > > > - del_timer(&neigh->ftimer); > > > > + del_timer_sync(&neigh->ftimer); > > > > } > > > > > > Are you sure this is safe ? > > > > > > del_timer_sync() could hang if the caller holds a lock that the timer > > > function would need to acquire. > > > > I think this is safe. The rose_ftimer_expiry() is an empty function that is > > shown below: > > > > static void rose_ftimer_expiry(struct timer_list *t) > > { > > } > > > > > > > > > > void rose_stop_t0timer(struct rose_neigh *neigh) > > > > { > > > > - del_timer(&neigh->t0timer); > > > > + del_timer_sync(&neigh->t0timer); > > > > } > > > > > > Same here, please explain why it is safe. > > > > The rose_stop_t0timer() may hold "rose_node_list_lock" and "rose_neigh_list_lock", > > but the timer handler rose_t0timer_expiry() that is shown below does not need > > these two locks. > > > > static void rose_t0timer_expiry(struct timer_list *t) > > { > > struct rose_neigh *neigh = from_timer(neigh, t, t0timer); > > > > rose_transmit_restart_request(neigh); > > > > neigh->dce_mode = 0; > > > > rose_start_t0timer(neigh); > > This will rearm the timer. del_timer_sync() will not help. Thank you for your time, but I don't think so. > Please read the comment in front of del_timer_sync(), in kernel/time/timer.c I wrote a kernel module to test whether del_timer_sync() could finish a timer handler that use mod_timer() to rewind itself. The following is the result. # insmod del_timer_sync.ko [ 929.374405] my_timer will be create. [ 929.374738] the jiffies is :4295595572 [ 930.411581] In my_timer_function [ 930.411956] the jiffies is 4295596609 [ 935.466643] In my_timer_function [ 935.467505] the jiffies is 4295601665 [ 940.586538] In my_timer_function [ 940.586916] the jiffies is 4295606784 [ 945.706579] In my_timer_function [ 945.706885] the jiffies is 4295611904 # # rmmod del_timer_sync.ko [ 948.507692] the del_timer_sync is :1 [ 948.507692] # # The result of the experiment shows that the timer handler could be killed after we execute del_timer_sync(), even if the timer could rewind itself. Best regards, Duoming Zhou
On Thu, Jun 30, 2022 at 5:51 PM <duoming@zju.edu.cn> wrote: > > Hello, > > On Thu, 30 Jun 2022 17:17:10 +0200 Eric Dumazet wrote: > > > > > > There are UAF bugs caused by rose_t0timer_expiry(). The > > > > > root cause is that del_timer() could not stop the timer > > > > > handler that is running and there is no synchronization. > > > > > One of the race conditions is shown below: > > > > > > > > > > (thread 1) | (thread 2) > > > > > | rose_device_event > > > > > | rose_rt_device_down > > > > > | rose_remove_neigh > > > > > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > > > > > ... | del_timer(&neigh->t0timer) > > > > > | kfree(rose_neigh) //[1]FREE > > > > > neigh->dce_mode //[2]USE | > > > > > > > > > > The rose_neigh is deallocated in position [1] and use in > > > > > position [2]. > > > > > > > > > > The crash trace triggered by POC is like below: > > > > > > > > > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > > > > > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > > > > > ... > > > > > Call Trace: > > > > > <IRQ> > > > > > dump_stack_lvl+0xbf/0xee > > > > > print_address_description+0x7b/0x440 > > > > > print_report+0x101/0x230 > > > > > ? expire_timers+0x144/0x320 > > > > > kasan_report+0xed/0x120 > > > > > ? expire_timers+0x144/0x320 > > > > > expire_timers+0x144/0x320 > > > > > __run_timers+0x3ff/0x4d0 > > > > > run_timer_softirq+0x41/0x80 > > > > > __do_softirq+0x233/0x544 > > > > > ... > > > > > > > > > > This patch changes del_timer() in rose_stop_t0timer() and > > > > > rose_stop_ftimer() to del_timer_sync() in order that the > > > > > timer handler could be finished before the resources such as > > > > > rose_neigh and so on are deallocated. As a result, the UAF > > > > > bugs could be mitigated. > > > > > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > > > > --- > > > > > net/rose/rose_link.c | 4 ++-- > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > > > > > index 8b96a56d3a4..9734d1264de 100644 > > > > > --- a/net/rose/rose_link.c > > > > > +++ b/net/rose/rose_link.c > > > > > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > > > > > > > > > void rose_stop_ftimer(struct rose_neigh *neigh) > > > > > { > > > > > - del_timer(&neigh->ftimer); > > > > > + del_timer_sync(&neigh->ftimer); > > > > > } > > > > > > > > Are you sure this is safe ? > > > > > > > > del_timer_sync() could hang if the caller holds a lock that the timer > > > > function would need to acquire. > > > > > > I think this is safe. The rose_ftimer_expiry() is an empty function that is > > > shown below: > > > > > > static void rose_ftimer_expiry(struct timer_list *t) > > > { > > > } > > > > > > > > > > > > > void rose_stop_t0timer(struct rose_neigh *neigh) > > > > > { > > > > > - del_timer(&neigh->t0timer); > > > > > + del_timer_sync(&neigh->t0timer); > > > > > } > > > > > > > > Same here, please explain why it is safe. > > > > > > The rose_stop_t0timer() may hold "rose_node_list_lock" and "rose_neigh_list_lock", > > > but the timer handler rose_t0timer_expiry() that is shown below does not need > > > these two locks. > > > > > > static void rose_t0timer_expiry(struct timer_list *t) > > > { > > > struct rose_neigh *neigh = from_timer(neigh, t, t0timer); > > > > > > rose_transmit_restart_request(neigh); > > > > > > neigh->dce_mode = 0; > > > > > > rose_start_t0timer(neigh); > > > > This will rearm the timer. del_timer_sync() will not help. > > Thank you for your time, but I don't think so. > > > Please read the comment in front of del_timer_sync(), in kernel/time/timer.c > > I wrote a kernel module to test whether del_timer_sync() could finish a timer handler > that use mod_timer() to rewind itself. The following is the result. > > # insmod del_timer_sync.ko > [ 929.374405] my_timer will be create. > [ 929.374738] the jiffies is :4295595572 > [ 930.411581] In my_timer_function > [ 930.411956] the jiffies is 4295596609 > [ 935.466643] In my_timer_function > [ 935.467505] the jiffies is 4295601665 > [ 940.586538] In my_timer_function > [ 940.586916] the jiffies is 4295606784 > [ 945.706579] In my_timer_function > [ 945.706885] the jiffies is 4295611904 > > # > # rmmod del_timer_sync.ko > [ 948.507692] the del_timer_sync is :1 > [ 948.507692] > # > # > > The result of the experiment shows that the timer handler could > be killed after we execute del_timer_sync(), even if the timer could > rewind itself. This is not enough to run an experiment to determine a comment is obsolete. Especially if you are not running the code from interrupts, like rose protocol might... If you think the comment is obsolete, please send a patch to amend it.
Hello, On Thu, 30 Jun 2022 18:07:39 +0200 Eric Dumazet wrote: > > > > > > There are UAF bugs caused by rose_t0timer_expiry(). The > > > > > > root cause is that del_timer() could not stop the timer > > > > > > handler that is running and there is no synchronization. > > > > > > One of the race conditions is shown below: > > > > > > > > > > > > (thread 1) | (thread 2) > > > > > > | rose_device_event > > > > > > | rose_rt_device_down > > > > > > | rose_remove_neigh > > > > > > rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) > > > > > > ... | del_timer(&neigh->t0timer) > > > > > > | kfree(rose_neigh) //[1]FREE > > > > > > neigh->dce_mode //[2]USE | > > > > > > > > > > > > The rose_neigh is deallocated in position [1] and use in > > > > > > position [2]. > > > > > > > > > > > > The crash trace triggered by POC is like below: > > > > > > > > > > > > BUG: KASAN: use-after-free in expire_timers+0x144/0x320 > > > > > > Write of size 8 at addr ffff888009b19658 by task swapper/0/0 > > > > > > ... > > > > > > Call Trace: > > > > > > <IRQ> > > > > > > dump_stack_lvl+0xbf/0xee > > > > > > print_address_description+0x7b/0x440 > > > > > > print_report+0x101/0x230 > > > > > > ? expire_timers+0x144/0x320 > > > > > > kasan_report+0xed/0x120 > > > > > > ? expire_timers+0x144/0x320 > > > > > > expire_timers+0x144/0x320 > > > > > > __run_timers+0x3ff/0x4d0 > > > > > > run_timer_softirq+0x41/0x80 > > > > > > __do_softirq+0x233/0x544 > > > > > > ... > > > > > > > > > > > > This patch changes del_timer() in rose_stop_t0timer() and > > > > > > rose_stop_ftimer() to del_timer_sync() in order that the > > > > > > timer handler could be finished before the resources such as > > > > > > rose_neigh and so on are deallocated. As a result, the UAF > > > > > > bugs could be mitigated. > > > > > > > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > > > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > > > > > --- > > > > > > net/rose/rose_link.c | 4 ++-- > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c > > > > > > index 8b96a56d3a4..9734d1264de 100644 > > > > > > --- a/net/rose/rose_link.c > > > > > > +++ b/net/rose/rose_link.c > > > > > > @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) > > > > > > > > > > > > void rose_stop_ftimer(struct rose_neigh *neigh) > > > > > > { > > > > > > - del_timer(&neigh->ftimer); > > > > > > + del_timer_sync(&neigh->ftimer); > > > > > > } > > > > > > > > > > Are you sure this is safe ? > > > > > > > > > > del_timer_sync() could hang if the caller holds a lock that the timer > > > > > function would need to acquire. > > > > > > > > I think this is safe. The rose_ftimer_expiry() is an empty function that is > > > > shown below: > > > > > > > > static void rose_ftimer_expiry(struct timer_list *t) > > > > { > > > > } > > > > > > > > > > > > > > > > void rose_stop_t0timer(struct rose_neigh *neigh) > > > > > > { > > > > > > - del_timer(&neigh->t0timer); > > > > > > + del_timer_sync(&neigh->t0timer); > > > > > > } > > > > > > > > > > Same here, please explain why it is safe. > > > > > > > > The rose_stop_t0timer() may hold "rose_node_list_lock" and "rose_neigh_list_lock", > > > > but the timer handler rose_t0timer_expiry() that is shown below does not need > > > > these two locks. > > > > > > > > static void rose_t0timer_expiry(struct timer_list *t) > > > > { > > > > struct rose_neigh *neigh = from_timer(neigh, t, t0timer); > > > > > > > > rose_transmit_restart_request(neigh); > > > > > > > > neigh->dce_mode = 0; > > > > > > > > rose_start_t0timer(neigh); > > > > > > This will rearm the timer. del_timer_sync() will not help. > > > > Thank you for your time, but I don't think so. > > > > > Please read the comment in front of del_timer_sync(), in kernel/time/timer.c > > > > I wrote a kernel module to test whether del_timer_sync() could finish a timer handler > > that use mod_timer() to rewind itself. The following is the result. > > > > # insmod del_timer_sync.ko > > [ 929.374405] my_timer will be create. > > [ 929.374738] the jiffies is :4295595572 > > [ 930.411581] In my_timer_function > > [ 930.411956] the jiffies is 4295596609 > > [ 935.466643] In my_timer_function > > [ 935.467505] the jiffies is 4295601665 > > [ 940.586538] In my_timer_function > > [ 940.586916] the jiffies is 4295606784 > > [ 945.706579] In my_timer_function > > [ 945.706885] the jiffies is 4295611904 > > > > # > > # rmmod del_timer_sync.ko > > [ 948.507692] the del_timer_sync is :1 > > [ 948.507692] > > # > > # > > > > The result of the experiment shows that the timer handler could > > be killed after we execute del_timer_sync(), even if the timer could > > rewind itself. > > > This is not enough to run an experiment to determine a comment is obsolete. > > Especially if you are not running the code from interrupts, like rose > protocol might... I have tested this patch, it could work. In order to further prove the del_timer_sync() could stop the timer that restart itself in its timer handler, I wrote the following kernel module whoes part of code is shown below: ================================================================= struct timer_list my_timer; static void my_timer_callback(struct timer_list *timer); static void start_timer(void); static void start_timer(void){ del_timer(&my_timer); my_timer.expires = jiffies+HZ; my_timer.function = my_timer_callback; add_timer(&my_timer); } static void my_timer_callback(struct timer_list *timer){ printk("In my_timer_function"); printk("the jiffies is %ld\n",jiffies); start_timer(); } static int __init del_timer_sync_init(void) { int result; printk("my_timer will be create.\n"); printk("the jiffies is :%ld\n", jiffies); timer_setup(&my_timer,my_timer_callback,0); result = mod_timer(&my_timer,jiffies + SIXP_TXDELAY); printk("the mod_timer is :%d\n\n",result); return 0; } static void __exit del_timer_sync_exit(void) { int result=del_timer_sync(&my_timer); printk("the del_timer_sync is :%d\n\n", result); } ================================================================= The timer handler is running from interrupts and del_timer_sync() could stop the timer that rewind itself in its timer handler, the result is shown below: # insmod del_timer_sync.ko [ 103.505857] my_timer will be create. [ 103.505922] the jiffies is :4294770832 [ 103.506845] the mod_timer is :0 [ 103.506845] # [ 103.532389] In my_timer_function [ 103.532452] the jiffies is 4294770859 [ 104.576768] In my_timer_function [ 104.577096] the jiffies is 4294771904 [ 105.600941] In my_timer_function [ 105.601072] the jiffies is 4294772928 [ 106.625397] In my_timer_function [ 106.625573] the jiffies is 4294773952 [ 107.648995] In my_timer_function [ 107.649212] the jiffies is 4294774976 [ 108.673037] In my_timer_function [ 108.673787] the jiffies is 4294776001 rmmod del_timer_sync.ko [ 109.649482] the del_timer_sync is :1 [ 109.649482] # The root cause is shown below: do { ret = try_to_del_timer_sync(timer); if (unlikely(ret < 0)) { del_timer_wait_running(timer); cpu_relax(); } } while (ret < 0); https://elixir.bootlin.com/linux/latest/source/kernel/time/timer.c#L1381 If we call another thread such as a work_queue or the code in other places to restart the timer instead of in its timer handler, the del_timer_sync() could not stop it. > If you think the comment is obsolete, please send a patch to amend it. The comment says: * Synchronization rules: Callers must prevent restarting of the timer, * otherwise this function is meaningless. We could restart the timer successfully except for restarting in its timer handler after we call del_timer_sync(). I think changing the comment to the following is better: * Synchronization rules: Callers must prevent restarting of the timer in * other places except for its timer handler, otherwise this function is * meaningless. Best regards, Duoming Zhou
diff --git a/net/rose/rose_link.c b/net/rose/rose_link.c index 8b96a56d3a4..9734d1264de 100644 --- a/net/rose/rose_link.c +++ b/net/rose/rose_link.c @@ -54,12 +54,12 @@ static void rose_start_t0timer(struct rose_neigh *neigh) void rose_stop_ftimer(struct rose_neigh *neigh) { - del_timer(&neigh->ftimer); + del_timer_sync(&neigh->ftimer); } void rose_stop_t0timer(struct rose_neigh *neigh) { - del_timer(&neigh->t0timer); + del_timer_sync(&neigh->t0timer); } int rose_ftimer_running(struct rose_neigh *neigh)
There are UAF bugs caused by rose_t0timer_expiry(). The root cause is that del_timer() could not stop the timer handler that is running and there is no synchronization. One of the race conditions is shown below: (thread 1) | (thread 2) | rose_device_event | rose_rt_device_down | rose_remove_neigh rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) ... | del_timer(&neigh->t0timer) | kfree(rose_neigh) //[1]FREE neigh->dce_mode //[2]USE | The rose_neigh is deallocated in position [1] and use in position [2]. The crash trace triggered by POC is like below: BUG: KASAN: use-after-free in expire_timers+0x144/0x320 Write of size 8 at addr ffff888009b19658 by task swapper/0/0 ... Call Trace: <IRQ> dump_stack_lvl+0xbf/0xee print_address_description+0x7b/0x440 print_report+0x101/0x230 ? expire_timers+0x144/0x320 kasan_report+0xed/0x120 ? expire_timers+0x144/0x320 expire_timers+0x144/0x320 __run_timers+0x3ff/0x4d0 run_timer_softirq+0x41/0x80 __do_softirq+0x233/0x544 ... This patch changes del_timer() in rose_stop_t0timer() and rose_stop_ftimer() to del_timer_sync() in order that the timer handler could be finished before the resources such as rose_neigh and so on are deallocated. As a result, the UAF bugs could be mitigated. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- net/rose/rose_link.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)