diff mbox series

[RFC,v2,19/31] timers: net: Use del_timer_shutdown() before freeing timer

Message ID 20221027150928.780676863@goodmis.org (mailing list archive)
State RFC
Delegated to: Kalle Valo
Headers show
Series None | expand

Commit Message

Steven Rostedt Oct. 27, 2022, 3:05 p.m. UTC
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Before a timer is freed, del_timer_shutdown() must be called.

Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home/

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Menglong Dong <imagedong@tencent.com>
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Cc: lvs-devel@vger.kernel.org
Cc: linux-afs@lists.infradead.org
Cc: linux-nfs@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c      | 6 +++---
 drivers/net/ethernet/marvell/sky2.c              | 2 +-
 drivers/net/ethernet/sun/sunvnet.c               | 2 +-
 drivers/net/usb/sierra_net.c                     | 2 +-
 drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c | 2 +-
 drivers/net/wireless/intersil/hostap/hostap_ap.c | 2 +-
 drivers/net/wireless/marvell/mwifiex/main.c      | 2 +-
 drivers/net/wireless/microchip/wilc1000/hif.c    | 8 ++++----
 net/802/garp.c                                   | 2 +-
 net/802/mrp.c                                    | 2 +-
 net/bridge/br_multicast.c                        | 6 +++---
 net/bridge/br_multicast_eht.c                    | 4 ++--
 net/core/gen_estimator.c                         | 2 +-
 net/core/sock.c                                  | 2 +-
 net/ipv4/inet_timewait_sock.c                    | 2 +-
 net/ipv4/ipmr.c                                  | 2 +-
 net/ipv6/ip6mr.c                                 | 2 +-
 net/mac80211/mesh_pathtbl.c                      | 2 +-
 net/netfilter/ipset/ip_set_list_set.c            | 2 +-
 net/netfilter/ipvs/ip_vs_lblc.c                  | 2 +-
 net/netfilter/ipvs/ip_vs_lblcr.c                 | 2 +-
 net/netfilter/xt_LED.c                           | 2 +-
 net/rxrpc/conn_object.c                          | 2 +-
 net/sched/cls_flow.c                             | 2 +-
 net/sunrpc/svc.c                                 | 2 +-
 net/tipc/discover.c                              | 2 +-
 net/tipc/monitor.c                               | 2 +-
 27 files changed, 35 insertions(+), 35 deletions(-)

Comments

Steven Rostedt Oct. 27, 2022, 7:55 p.m. UTC | #1
On Thu, 27 Oct 2022 12:38:16 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> On 10/27/22 12:27, Steven Rostedt wrote:
> > On Thu, 27 Oct 2022 15:20:58 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> >   
> >>> (many more of those)
> >>> ...
> >>> [   16.329989]  timer_fixup_free+0x40/0x54  
> >>
> >> Ah, I see the issue here. Looks like the timer_fixup_free() is calling
> >> itself and crashing.
> >>
> >> Let me take a look into that. I didn't touch the fixup code, and there
> >> could be an assumption there that it's behaving with the old approach.  
> > 
> > Can you add this and see if it makes this issue go away?
> >   
> 
> Yes, that fixes the crash. However, it still reports
> 
> [   12.235054] ------------[ cut here ]------------
> [   12.235240] ODEBUG: free active (active state 0) object type: timer_list hint: tcp_write_timer+0x0/0x190
> [   12.237331] WARNING: CPU: 0 PID: 310 at lib/debugobjects.c:502 debug_print_object+0xb8/0x100
> ...
> [   12.255251] Call trace:
> [   12.255305]  debug_print_object+0xb8/0x100
> [   12.255385]  __debug_check_no_obj_freed+0x1d0/0x25c
> [   12.255474]  debug_check_no_obj_freed+0x20/0x90
> [   12.255555]  slab_free_freelist_hook.constprop.0+0xac/0x1b0
> [   12.255650]  kmem_cache_free+0x1ac/0x500
> [   12.255728]  __sk_destruct+0x140/0x2a0
> [   12.255805]  sk_destruct+0x54/0x64
> [   12.255877]  __sk_free+0x74/0x120
> [   12.255944]  sk_free+0x64/0x8c
> [   12.256009]  tcp_close+0x94/0xc0
> [   12.256076]  inet_release+0x50/0xb0
> [   12.256145]  __sock_release+0x44/0xbc
> [   12.256219]  sock_close+0x18/0x30
> [   12.256292]  __fput+0x84/0x270
> [   12.256361]  ____fput+0x10/0x20
> [   12.256426]  task_work_run+0x88/0xf0
> [   12.256499]  do_exit+0x334/0xafc
> [   12.256566]  do_group_exit+0x34/0x90
> [   12.256634]  __arm64_sys_exit_group+0x18/0x20
> [   12.256713]  invoke_syscall+0x48/0x114
> [   12.256789]  el0_svc_common.constprop.0+0x60/0x11c
> [   12.256874]  do_el0_svc+0x30/0xd0
> [   12.256943]  el0_svc+0x48/0xc0
> [   12.257008]  el0t_64_sync_handler+0xbc/0x13c
> [   12.257086]  el0t_64_sync+0x18c/0x190
> 
> Is that a real problem or a false positive ? I didn't see that
> without your patch series (which of course might be the whole point
> of the series).
> 

I think this is indeed an issue, and I'm replying to the net patch as it
has the necessary folks Cc'd.

The ipv4 tcp code has:

void tcp_init_xmit_timers(struct sock *sk)
{
	inet_csk_init_xmit_timers(sk, &tcp_write_timer, &tcp_delack_timer,
				  &tcp_keepalive_timer);

And from the above back trace:

tcp_close() where I'm assuming that tcp_disconnect() or tcp_done() was
called that both calls:

  tcp_clear_xmit_timers(sk);

That calls:

	inet_csk_clear_xmit_timers(sk);

That has:

void inet_csk_clear_xmit_timers(struct sock *sk)
{
	struct inet_connection_sock *icsk = inet_csk(sk);

	icsk->icsk_pending = icsk->icsk_ack.pending = 0;

	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
	sk_stop_timer(sk, &icsk->icsk_delack_timer);
	sk_stop_timer(sk, &sk->sk_timer);
}

Where:

void sk_stop_timer(struct sock *sk, struct timer_list* timer)
{
	if (del_timer(timer))
		__sock_put(sk);
}


Hence, this is a case where we have timers that have been disabled with
only del_timer() before the timers are freed.

I think we need to update this code to squeeze in a del_timer_shutdown() to
make sure that the timers are never restarted.

There is a sk_stop_timer_sync() that I changed to use del_timer_shutdown()
but that's only used in one file: net/mptcp/pm_netlink.c

-- Steve
Linus Torvalds Oct. 27, 2022, 8:15 p.m. UTC | #2
On Thu, Oct 27, 2022 at 12:55 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I think we need to update this code to squeeze in a del_timer_shutdown() to
> make sure that the timers are never restarted.

So the reason the networking code does this is that it can't just do
the old 'sync()' thing, the timers are deleted in contexts where that
isn't valid.

Which is also afaik why the networking code does that whole "timer
implies a refcount to the socket" and then does the

    if (del_timer(timer))
           sock_put()

thing (ie if the del_timer failed - possibly because it was already
running - you leave the refcount alone).

So the networking code cannot do the del_timer_shutdown() for the same
reason it cannot do the del_timer_sync(): it can't afford to wait for
the timer to stop running.

I suspect it needs something like a new "del_timer_shutdown_async()"
that isn't synchronous, but does that

 - acts as del_timer in that it doesn't wait, and returns a success if
it could just remove the pending case

 - does that "mark timer for shutdown" in that success case

or something similar.

But the networking people will know better.

               Linus
Steven Rostedt Oct. 27, 2022, 8:34 p.m. UTC | #3
On Thu, 27 Oct 2022 13:15:23 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Oct 27, 2022 at 12:55 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > I think we need to update this code to squeeze in a del_timer_shutdown() to
> > make sure that the timers are never restarted.  
> 
> So the reason the networking code does this is that it can't just do
> the old 'sync()' thing, the timers are deleted in contexts where that
> isn't valid.
> 
> Which is also afaik why the networking code does that whole "timer
> implies a refcount to the socket" and then does the
> 
>     if (del_timer(timer))
>            sock_put()
> 
> thing (ie if the del_timer failed - possibly because it was already
> running - you leave the refcount alone).

OK, so the above is assuming that the timer is always active, and
del_timer() returns if it successfully removed it (where it can call
sock_put()), but if del_timer() returns 0, that means the timer is
currently running (or about to be), so it doesn't call sock_put().

> 
> So the networking code cannot do the del_timer_shutdown() for the same
> reason it cannot do the del_timer_sync(): it can't afford to wait for
> the timer to stop running.
> 
> I suspect it needs something like a new "del_timer_shutdown_async()"
> that isn't synchronous, but does that
> 
>  - acts as del_timer in that it doesn't wait, and returns a success if
> it could just remove the pending case
> 
>  - does that "mark timer for shutdown" in that success case
> 
> or something similar.
>

What about del_timer_try_shutdown(), that if it removes the timer, it sets
the function to NULL (making it equivalent to a successful shutdown),
otherwise it does nothing. Allowing the the timer to be rearmed.

I think this would work in this case.

-- Steve
Linus Torvalds Oct. 27, 2022, 8:48 p.m. UTC | #4
On Thu, Oct 27, 2022 at 1:34 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> What about del_timer_try_shutdown(), that if it removes the timer, it sets
> the function to NULL (making it equivalent to a successful shutdown),
> otherwise it does nothing. Allowing the the timer to be rearmed.

Sounds sane to me and should work, but as mentioned, I think the
networking people need to say "yeah" too.

And maybe that function can also disallow any future re-arming even
for the case where the timer couldn't be actively removed.

So any *currently* active timer wouldn't be waited for (either because
locking may make that a deadlock situation, or simply due to
performance issues), but at least it would guarantee that no new timer
activations can happen.

Because I do like the whole notion of "timer has been shutdown and
cannot be used as a timer any more without re-initializing it" being a
real state - even for a timer that may be "currently in flight".

So this all sounds very worthwhile to me, but I'm not surprised that
we have code that then knows about all the subtleties of "del_timer()
might still have a running timer" and actually take advantage of it
(where "advantage" is likely more of a "deal with the complexities"
rather than anything really positive ;)

And those existing subtle users might want particular semantics to at
least make said complexities easier.

               Linus
Steven Rostedt Oct. 27, 2022, 9:07 p.m. UTC | #5
On Thu, 27 Oct 2022 13:48:54 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Oct 27, 2022 at 1:34 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > What about del_timer_try_shutdown(), that if it removes the timer, it sets
> > the function to NULL (making it equivalent to a successful shutdown),
> > otherwise it does nothing. Allowing the the timer to be rearmed.  
> 
> Sounds sane to me and should work, but as mentioned, I think the
> networking people need to say "yeah" too.
> 
> And maybe that function can also disallow any future re-arming even
> for the case where the timer couldn't be actively removed.

Well, I think this current use case will break if we prevent the timer from
being rearmed or run again if it's not found. But as you said, the
networking folks need to confirm or deny it.

The fact that it does the sock_put() when it removes the timer makes me
think that it can be called again, and we shouldn't prevent that from
happening.

The debug code will let us know too, as it only "frees" it for freeing if
it deactivated the timer and shut it down.

> 
> So any *currently* active timer wouldn't be waited for (either because
> locking may make that a deadlock situation, or simply due to
> performance issues), but at least it would guarantee that no new timer
> activations can happen.
> 
> Because I do like the whole notion of "timer has been shutdown and
> cannot be used as a timer any more without re-initializing it" being a
> real state - even for a timer that may be "currently in flight".
> 
> So this all sounds very worthwhile to me, but I'm not surprised that
> we have code that then knows about all the subtleties of "del_timer()
> might still have a running timer" and actually take advantage of it
> (where "advantage" is likely more of a "deal with the complexities"
> rather than anything really positive ;)

Good to hear. This has been a thorn in our side as we keep hitting these
crashes in the timer code that look like a timer was freed before it
triggered.

> 
> And those existing subtle users might want particular semantics to at
> least make said complexities easier.
> 

Yeah, as someone told me recently, "If you let them play long enough without
setting out the rules, they will take advantage of everything and it will be
extremely hard to get them back in order".

-- Steve
Steven Rostedt Oct. 27, 2022, 9:07 p.m. UTC | #6
On Thu, 27 Oct 2022 16:34:53 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> What about del_timer_try_shutdown(), that if it removes the timer, it sets
> the function to NULL (making it equivalent to a successful shutdown),
> otherwise it does nothing. Allowing the the timer to be rearmed.
> 
> I think this would work in this case.

Guenter,

Can you apply this patch on top of the series, and see if it makes the
warning go away?

diff --git a/include/linux/timer.h b/include/linux/timer.h
index d4d90149d015..e3c5f4bdd526 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -184,12 +184,23 @@ static inline int timer_pending(const struct timer_list * timer)
 	return !hlist_unhashed_lockless(&timer->entry);
 }
 
+extern int __del_timer(struct timer_list * timer, bool free);
+
 extern void add_timer_on(struct timer_list *timer, int cpu);
-extern int del_timer(struct timer_list * timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
 extern int timer_reduce(struct timer_list *timer, unsigned long expires);
 
+static inline int del_timer_try_shutdown(struct timer_list *timer)
+{
+	return __del_timer(timer, true);
+}
+
+static inline int del_timer(struct timer_list *timer)
+{
+	return __del_timer(timer, false);
+}
+
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 7305c65ad0eb..073031cb3bb9 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1255,7 +1255,7 @@ EXPORT_SYMBOL_GPL(add_timer_on);
  * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
  * active timer returns 1.)
  */
-int del_timer(struct timer_list *timer)
+int __del_timer(struct timer_list *timer, bool free)
 {
 	struct timer_base *base;
 	unsigned long flags;
@@ -1266,12 +1266,16 @@ int del_timer(struct timer_list *timer)
 	if (timer_pending(timer)) {
 		base = lock_timer_base(timer, &flags);
 		ret = detach_if_pending(timer, base, true);
+		if (free && ret) {
+			timer->function = NULL;
+			debug_timer_deactivate(timer);
+		}
 		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
 	return ret;
 }
-EXPORT_SYMBOL(del_timer);
+EXPORT_SYMBOL(__del_timer);
 
 static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 {
diff --git a/net/core/sock.c b/net/core/sock.c
index 10cc84379d75..23a97442a0a6 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3345,7 +3345,7 @@ EXPORT_SYMBOL(sk_reset_timer);
 
 void sk_stop_timer(struct sock *sk, struct timer_list* timer)
 {
-	if (del_timer(timer))
+	if (del_timer_try_shutdown(timer))
 		__sock_put(sk);
 }
 EXPORT_SYMBOL(sk_stop_timer);
Steven Rostedt Oct. 27, 2022, 9:15 p.m. UTC | #7
On Thu, 27 Oct 2022 17:07:20 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > And maybe that function can also disallow any future re-arming even
> > for the case where the timer couldn't be actively removed.  

The naming of the functions will depend on this.

If the async version always shuts down the timer, then we should have the
interface be:

	del_timer_shutdown() <- async

	del_timer_shutdown_sync <- sync

As it would match the del_timer() and del_timer_sync() semantics.

If shutdown only happens if the timer is removed, then I believe the
current approach of del_timer_shutdown() being synchronous and
del_timer_try_shutdown() being async is the way to go, as it follows more
the semantics of mutex_lock() and mutex_trylock().

-- Steve
Steven Rostedt Oct. 27, 2022, 10:35 p.m. UTC | #8
On Thu, 27 Oct 2022 17:07:20 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> Well, I think this current use case will break if we prevent the timer from
> being rearmed or run again if it's not found. But as you said, the
> networking folks need to confirm or deny it.
> 
> The fact that it does the sock_put() when it removes the timer makes me
> think that it can be called again, and we shouldn't prevent that from
> happening.
> 
> The debug code will let us know too, as it only "frees" it for freeing if
> it deactivated the timer and shut it down.

I think we have our answer from Guenter's report:


Linux version 6.1.0-rc2-00138-gced58c742836 (groeck@jupiter) (aarch64-linux-gcc (GCC) 11.3.0, GNU ld (GNU Binutils) 2.39) #1 SMP PREEMPT Thu Oct 27 14:53:17 PDT 2022
[   17.258727] ------------[ cut here ]------------
[   17.259079] ODEBUG: free active (active state 0) object type: timer_list hint: tcp_write_timer+0x0/0x190
[   17.259723] WARNING: CPU: 0 PID: 309 at lib/debugobjects.c:502 debug_print_object+0xb8/0x100
[   17.259951] Modules linked in:
[   17.260249] CPU: 0 PID: 309 Comm: telnet Tainted: G                 N 6.1.0-rc2-00138-gced58c742836 #1
[   17.260518] Hardware name: linux,dummy-virt (DT)
[   17.260779] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   17.260967] pc : debug_print_object+0xb8/0x100
[   17.261096] lr : debug_print_object+0xb8/0x100
[   17.261223] sp : ffff8000086539e0
[   17.261324] x29: ffff8000086539e0 x28: 0000000000000004 x27: ffff0d2ac2168000
[   17.261574] x26: 0000000000000000 x25: ffffa241e2b9de18 x24: ffffa241e4f8fcd8
[   17.261772] x23: ffffa241e336b370 x22: ffffa241e2b9de18 x21: ffff0d2ac20c5710
[   17.261967] x20: ffffa241e4ea2568 x19: ffffa241e3ea3c00 x18: 00000000ffffffff
[   17.262161] x17: 6c6973742068696e x16: 3a2074696d65725f x15: 6563742074797065
[   17.262375] x14: 65203029206f626a x13: ffffa241e3ec7640 x12: 0000000000000d50
[   17.262591] x11: 0000000000000470 x10: ffffa241e3f1f640 x9 : ffffa241e3ec7640
[   17.262821] x8 : 00000000ffffefff x7 : ffffa241e3f1f640 x6 : 0000000000000000
[   17.263028] x5 : ffff0d2adfebba68 x4 : 0000000000000000 x3 : 0000000000000027
[   17.263235] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0d2ac658b340
[   17.263528] Call trace:
[   17.263646]  debug_print_object+0xb8/0x100
[   17.263795]  __debug_check_no_obj_freed+0x1d0/0x25c
[   17.263927]  debug_check_no_obj_freed+0x20/0x90
[   17.264051]  slab_free_freelist_hook.constprop.0+0xac/0x1b0
[   17.264197]  kmem_cache_free+0x1ac/0x500
[   17.264311]  __sk_destruct+0x140/0x2a0
[   17.264425]  sk_destruct+0x54/0x64
[   17.264531]  __sk_free+0x74/0x120
[   17.264636]  sk_free+0x64/0x8c
[   17.264736]  tcp_close+0x94/0xc0
[   17.264840]  inet_release+0x50/0xb0
[   17.264949]  __sock_release+0x44/0xbc
[   17.265061]  sock_close+0x18/0x30
[   17.265166]  __fput+0x84/0x270
[   17.265266]  ____fput+0x10/0x20
[   17.265366]  task_work_run+0x88/0xf0
[   17.265491]  do_exit+0x334/0xafc
[   17.265596]  do_group_exit+0x34/0x90
[   17.265705]  __arm64_sys_exit_group+0x18/0x20
[   17.265826]  invoke_syscall+0x48/0x114
[   17.265941]  el0_svc_common.constprop.0+0x60/0x11c
[   17.266070]  do_el0_svc+0x30/0xd0
[   17.266175]  el0_svc+0x48/0xc0
[   17.266276]  el0t_64_sync_handler+0xbc/0x13c
[   17.266396]  el0t_64_sync+0x18c/0x190
[   17.266565] irq event stamp: 5192
[   17.266676] hardirqs last  enabled at (5191): [<ffffa241e1926a18>] __up_console_sem+0x78/0x84
[   17.266903] hardirqs last disabled at (5192): [<ffffa241e2b4d504>] el1_dbg+0x24/0x90
[   17.267093] softirqs last  enabled at (5170): [<ffffa241e181050c>] __do_softirq+0x46c/0x5d8
[   17.267305] softirqs last disabled at (5163): [<ffffa241e1816750>] ____do_softirq+0x10/0x20
[   17.267506] ---[ end trace 0000000000000000 ]---
[   17.275715] ------------[ cut here ]------------

I'll go modify that code to make it shutdown even if it returns zero.

I thinks this means we'll need to change the name to:

 del_timer_shutdown()
 del_timer_shutdown_sync()

But I want to confirm this first.

-- Steve
Guenter Roeck Oct. 28, 2022, 3:16 p.m. UTC | #9
On 10/27/22 14:07, Steven Rostedt wrote:
> On Thu, 27 Oct 2022 16:34:53 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
>> What about del_timer_try_shutdown(), that if it removes the timer, it sets
>> the function to NULL (making it equivalent to a successful shutdown),
>> otherwise it does nothing. Allowing the the timer to be rearmed.
>>
>> I think this would work in this case.
> 
> Guenter,
> 
> Can you apply this patch on top of the series, and see if it makes the
> warning go away?
> 

That patch not only helps, it also fixes the crash seen with openrisc.
For that crash, I was able to collect some useful data; see the log below.

Thanks,
Guenter

---
WARNING: CPU: 0 PID: 7 at lib/debugobjects.c:502 debug_print_object+0xc0/0xe8
ODEBUG: free active (active state 0) object type: timer_list hint: rcu_lock_map+0x0/0x14
Modules linked in:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 6.1.0-rc2-00145-g2c4e85e9ac93 #1
Call trace:
[<048ecc8e>] dump_stack_lvl+0x44/0x80
[<c6a7029c>] dump_stack+0x1c/0x2c
[<b225e4eb>] __warn+0xdc/0x118
[<1070b766>] ? debug_print_object+0xc0/0xe8
[<57923a76>] warn_slowpath_fmt+0x78/0x90
[<1070b766>] debug_print_object+0xc0/0xe8
[<b3abbcb0>] __debug_check_no_obj_freed+0x230/0x2b8
[<508d9b5a>] ? delayed_put_task_struct+0x0/0x84
[<30f5a2a0>] ? _s_kernel_ro+0x0/0x200
[<403ab082>] debug_check_no_obj_freed+0x30/0x40
[<82702c56>] free_pcp_prepare+0xc4/0x2b0
[<508d9b5a>] ? delayed_put_task_struct+0x0/0x84
[<7798b190>] free_unref_page+0x44/0x210
[<d73717e5>] __free_pages+0x108/0x124
[<a32de4eb>] slob_free_pages+0x9c/0xac
[<bd51c171>] slob_free+0x40c/0x62c
[<a2d26e0e>] ? thread_stack_free_rcu+0x0/0x44
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<7794ec75>] ? rcu_process_callbacks+0xdc/0x224
[<7794ec75>] ? rcu_process_callbacks+0xdc/0x224
[<d76fe88f>] kmem_cache_free+0x64/0xa0
[<46d25dac>] free_task+0x7c/0xe0
[<2df25813>] __put_task_struct+0xe8/0x194
[<64f9675b>] delayed_put_task_struct+0x58/0x84
[<8755437e>] rcu_process_callbacks+0xf0/0x224
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<020db442>] ? rcu_process_callbacks+0x178/0x224
[<87626af4>] __do_softirq+0x11c/0x2f8
[<c3f89a50>] ? smpboot_thread_fn+0x4c/0x304
[<c3f89a50>] ? smpboot_thread_fn+0x4c/0x304
[<021b0175>] ? smpboot_thread_fn+0x188/0x304
[<f2e79ebd>] ? smpboot_thread_fn+0x158/0x304
[<966be0e6>] run_ksoftirqd+0x4c/0x80
[<4bf65f60>] smpboot_thread_fn+0x180/0x304
[<3f914d93>] ? _raw_spin_unlock_irqrestore+0x50/0x84
[<bef37779>] ? __kthread_parkme+0x60/0xdc
[<b0798e10>] ? smpboot_thread_fn+0x0/0x304
[<c463cd92>] kthread+0x11c/0x144
[<3eaef0b7>] ? kthread+0x0/0x144
[<ef2f6228>] ret_from_fork+0x1c/0x84
---[ end trace 0000000000000000 ]---
Unable to handle kernel access
  at virtual address 0xbd6ed6a4

Oops#: 0000
CPU #: 0
    PC: c0056c78    SR: 00008679    SP: c1027c24
GPR00: 00000000 GPR01: c1027c24 GPR02: c1027c78 GPR03: 00008279
GPR04: 00000000 GPR05: 00000000 GPR06: 00000000 GPR07: 00000001
GPR08: 00000000 GPR09: c0056c64 GPR10: c1026000 GPR11: 00000000
GPR12: 00000000 GPR13: 00000001 GPR14: c05c0000 GPR15: 00000000
GPR16: 00000001 GPR17: bd6ed6a4 GPR18: ff4517b0 GPR19: fd145f00
GPR20: 00000000 GPR21: 00000000 GPR22: 00000000 GPR23: c0760000
GPR24: c10232a0 GPR25: 00000003 GPR26: 00000000 GPR27: 00000000
GPR28: c1a00458 GPR29: 00000000 GPR30: c0790000 GPR31: 00000000
   RES: 00000000 oGPR11: ffffffff
Process ksoftirqd/0 (pid: 7, stackpage=c10232a0)

Stack:
Call trace:
[<6ce5cfad>] __lock_acquire.constprop.0+0xa8/0x914
[<4bc14e12>] ? __del_timer_sync+0x0/0x128
[<da915c87>] lock_acquire.part.0.isra.0+0xd4/0x1ac
[<4bc14e12>] ? __del_timer_sync+0x0/0x128
[<9b341df3>] lock_acquire+0x2c/0x44
[<233b5cbc>] __del_timer_sync+0x64/0x128
[<4bc14e12>] ? __del_timer_sync+0x0/0x128
[<05cd2741>] timer_fixup_free+0x34/0x5c
[<3fa496ad>] __debug_check_no_obj_freed+0x250/0x2b8
[<508d9b5a>] ? delayed_put_task_struct+0x0/0x84
[<30f5a2a0>] ? _s_kernel_ro+0x0/0x200
[<403ab082>] debug_check_no_obj_freed+0x30/0x40
[<82702c56>] free_pcp_prepare+0xc4/0x2b0
[<508d9b5a>] ? delayed_put_task_struct+0x0/0x84
[<7798b190>] free_unref_page+0x44/0x210
[<d73717e5>] __free_pages+0x108/0x124
[<a32de4eb>] slob_free_pages+0x9c/0xac
[<bd51c171>] slob_free+0x40c/0x62c
[<a2d26e0e>] ? thread_stack_free_rcu+0x0/0x44
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<7794ec75>] ? rcu_process_callbacks+0xdc/0x224
[<7794ec75>] ? rcu_process_callbacks+0xdc/0x224
[<d76fe88f>] kmem_cache_free+0x64/0xa0
[<46d25dac>] free_task+0x7c/0xe0
[<2df25813>] __put_task_struct+0xe8/0x194
[<64f9675b>] delayed_put_task_struct+0x58/0x84
[<8755437e>] rcu_process_callbacks+0xf0/0x224
[<24b2df6c>] ? rcu_process_callbacks+0x114/0x224
[<020db442>] ? rcu_process_callbacks+0x178/0x224
[<87626af4>] __do_softirq+0x11c/0x2f8
[<c3f89a50>] ? smpboot_thread_fn+0x4c/0x304
[<c3f89a50>] ? smpboot_thread_fn+0x4c/0x304
[<021b0175>] ? smpboot_thread_fn+0x188/0x304
[<f2e79ebd>] ? smpboot_thread_fn+0x158/0x304
[<966be0e6>] run_ksoftirqd+0x4c/0x80
[<4bf65f60>] smpboot_thread_fn+0x180/0x304
[<3f914d93>] ? _raw_spin_unlock_irqrestore+0x50/0x84
[<bef37779>] ? __kthread_parkme+0x60/0xdc
[<b0798e10>] ? smpboot_thread_fn+0x0/0x304
[<c463cd92>] kthread+0x11c/0x144
[<3eaef0b7>] ? kthread+0x0/0x144
[<ef2f6228>] ret_from_fork+0x1c/0x84
Steven Rostedt Oct. 28, 2022, 10:31 p.m. UTC | #10
Could someone from networking confirm (or deny) that the timer being
removed in sk_stop_timer() will no longer be used even if del_timer()
returns false?

net/core/sock.c:

void sk_stop_timer(struct sock *sk, struct timer_list* timer)
{
	if (del_timer(timer))
		__sock_put(sk);
}

If this is the case, then I'll add the following interface:

   del_timer_sync_shutdown() // the common case which syncs

   del_timer_shutdown() // the uncommon case, that returns immediately
                        // used for those cases that add extra code to
                        // handle it, like sk_stop_timer()


Which has the same semantics as del_timer_sync() and del_timer()
respectively, but will prevent the timer from being rearmed again.

This way we can convert the sk_stop_timer() to:

void sk_stop_timer(struct sock *sk, struct timer_list* timer)
{
	if (del_timer_shutdown(timer))
		__sock_put(sk);
}


We can also add the del_timer_shutdown() to other locations that need to
put a timer into a shutdown state before freeing, and where it's in a
context that can not call del_timer_sync_shutdown().

-- Steve
Jakub Kicinski Oct. 28, 2022, 10:46 p.m. UTC | #11
On Fri, 28 Oct 2022 18:31:49 -0400 Steven Rostedt wrote:
> Could someone from networking confirm (or deny) that the timer being
> removed in sk_stop_timer() will no longer be used even if del_timer()
> returns false?
> 
> net/core/sock.c:
> 
> void sk_stop_timer(struct sock *sk, struct timer_list* timer)
> {
> 	if (del_timer(timer))
> 		__sock_put(sk);
> }
> 
> If this is the case, then I'll add the following interface:
> 
>    del_timer_sync_shutdown() // the common case which syncs
> 
>    del_timer_shutdown() // the uncommon case, that returns immediately
>                         // used for those cases that add extra code to
>                         // handle it, like sk_stop_timer()

Sorry too many bugs at once :)

FWIW Paolo was saying privately earlier today that he spotted some cases
of reuse, he gave an example of ccid2_hc_tx_packet_recv()

So we can't convert all cases of sk_stop_timer() in one fell swoop :(

> Which has the same semantics as del_timer_sync() and del_timer()
> respectively, but will prevent the timer from being rearmed again.
> 
> This way we can convert the sk_stop_timer() to:
> 
> void sk_stop_timer(struct sock *sk, struct timer_list* timer)
> {
> 	if (del_timer_shutdown(timer))
> 		__sock_put(sk);
> }
> 
> 
> We can also add the del_timer_shutdown() to other locations that need to
> put a timer into a shutdown state before freeing, and where it's in a
> context that can not call del_timer_sync_shutdown().
Paolo Abeni Oct. 30, 2022, 5:22 p.m. UTC | #12
On Fri, 2022-10-28 at 15:46 -0700, Jakub Kicinski wrote:
> On Fri, 28 Oct 2022 18:31:49 -0400 Steven Rostedt wrote:
> > Could someone from networking confirm (or deny) that the timer being
> > removed in sk_stop_timer() will no longer be used even if del_timer()
> > returns false?
> > 
> > net/core/sock.c:
> > 
> > void sk_stop_timer(struct sock *sk, struct timer_list* timer)
> > {
> > 	if (del_timer(timer))
> > 		__sock_put(sk);
> > }
> > 
> > If this is the case, then I'll add the following interface:
> > 
> >    del_timer_sync_shutdown() // the common case which syncs
> > 
> >    del_timer_shutdown() // the uncommon case, that returns immediately
> >                         // used for those cases that add extra code to
> >                         // handle it, like sk_stop_timer()
> 
> Sorry too many bugs at once :)
> 
> FWIW Paolo was saying privately earlier today that he spotted some cases
> of reuse, he gave an example of ccid2_hc_tx_packet_recv()

For the records, there are other cases, e.g. after sk_stop_timer() in 
clear_3rdack_retransmission() (mptcp code) the timer can be-rearmed
without re-initializing. I *think* there are more of such use in the 
in ax25/rose code.

> So we can't convert all cases of sk_stop_timer() in one fell swoop :(

On the positive side, I think converting the sk_stop_timer in 
inet_csk_clear_xmit_timers() should be safe and should cover the issue
reported by Guenter

Cheers,

Paolo
Steven Rostedt Nov. 3, 2022, 9:51 p.m. UTC | #13
On Sun, 30 Oct 2022 18:22:03 +0100
Paolo Abeni <pabeni@redhat.com> wrote:

> On the positive side, I think converting the sk_stop_timer in 
> inet_csk_clear_xmit_timers() should be safe and should cover the issue
> reported by Guenter

Would something like this be OK? 

[ Note, talking with Thomas Gleixner, we agreed that we are changing the
  name to: time_shutdown_sync() and timer_shutdown() (no wait version).
  I'll be posting new patches soon. ]

-- Steve

diff --git a/include/net/sock.h b/include/net/sock.h
index 22f8bab583dd..0ef58697d4e5 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2439,6 +2439,8 @@ void sk_stop_timer(struct sock *sk, struct timer_list *timer);
 
 void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer);
 
+void sk_shutdown_timer(struct sock *sk, struct timer_list *timer);
+
 int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
 			struct sk_buff *skb, unsigned int flags,
 			void (*destructor)(struct sock *sk,
diff --git a/net/core/sock.c b/net/core/sock.c
index a3ba0358c77c..82124862b594 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3357,6 +3357,13 @@ void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer)
 }
 EXPORT_SYMBOL(sk_stop_timer_sync);
 
+void sk_shutdown_timer(struct sock *sk, struct timer_list* timer)
+{
+	if (timer_shutdown(timer))
+		__sock_put(sk);
+}
+EXPORT_SYMBOL(sk_shutdown_timer);
+
 void sock_init_data(struct socket *sock, struct sock *sk)
 {
 	sk_init_common(sk);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 5e70228c5ae9..71f398f51958 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -722,15 +722,15 @@ void inet_csk_clear_xmit_timers(struct sock *sk)
 
 	icsk->icsk_pending = icsk->icsk_ack.pending = 0;
 
-	sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
-	sk_stop_timer(sk, &icsk->icsk_delack_timer);
-	sk_stop_timer(sk, &sk->sk_timer);
+	sk_shutdown_timer(sk, &icsk->icsk_retransmit_timer);
+	sk_shutdown_timer(sk, &icsk->icsk_delack_timer);
+	sk_shutdown_timer(sk, &sk->sk_timer);
 }
 EXPORT_SYMBOL(inet_csk_clear_xmit_timers);
 
 void inet_csk_delete_keepalive_timer(struct sock *sk)
 {
-	sk_stop_timer(sk, &sk->sk_timer);
+	sk_shutdown_timer(sk, &sk->sk_timer);
 }
 EXPORT_SYMBOL(inet_csk_delete_keepalive_timer);
Eric Dumazet Nov. 4, 2022, midnight UTC | #14
On Thu, Nov 3, 2022 at 2:51 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Sun, 30 Oct 2022 18:22:03 +0100
> Paolo Abeni <pabeni@redhat.com> wrote:
>
> > On the positive side, I think converting the sk_stop_timer in
> > inet_csk_clear_xmit_timers() should be safe and should cover the issue
> > reported by Guenter
>
> Would something like this be OK?
>
> [ Note, talking with Thomas Gleixner, we agreed that we are changing the
>   name to: time_shutdown_sync() and timer_shutdown() (no wait version).
>   I'll be posting new patches soon. ]
>
> -- Steve
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 22f8bab583dd..0ef58697d4e5 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2439,6 +2439,8 @@ void sk_stop_timer(struct sock *sk, struct timer_list *timer);
>
>  void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer);
>
> +void sk_shutdown_timer(struct sock *sk, struct timer_list *timer);
> +
>  int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
>                         struct sk_buff *skb, unsigned int flags,
>                         void (*destructor)(struct sock *sk,
> diff --git a/net/core/sock.c b/net/core/sock.c
> index a3ba0358c77c..82124862b594 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3357,6 +3357,13 @@ void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer)
>  }
>  EXPORT_SYMBOL(sk_stop_timer_sync);
>
> +void sk_shutdown_timer(struct sock *sk, struct timer_list* timer)
> +{
> +       if (timer_shutdown(timer))
> +               __sock_put(sk);
> +}
> +EXPORT_SYMBOL(sk_shutdown_timer);
> +
>  void sock_init_data(struct socket *sock, struct sock *sk)
>  {
>         sk_init_common(sk);
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 5e70228c5ae9..71f398f51958 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -722,15 +722,15 @@ void inet_csk_clear_xmit_timers(struct sock *sk)
>
>         icsk->icsk_pending = icsk->icsk_ack.pending = 0;
>
> -       sk_stop_timer(sk, &icsk->icsk_retransmit_timer);
> -       sk_stop_timer(sk, &icsk->icsk_delack_timer);
> -       sk_stop_timer(sk, &sk->sk_timer);
> +       sk_shutdown_timer(sk, &icsk->icsk_retransmit_timer);
> +       sk_shutdown_timer(sk, &icsk->icsk_delack_timer);
> +       sk_shutdown_timer(sk, &sk->sk_timer);
>  }
>  EXPORT_SYMBOL(inet_csk_clear_xmit_timers);

 inet_csk_clear_xmit_timers() can be called multiple times during TCP
socket lifetime.

(See tcp_disconnect(), which can be followed by another connect() ... and loop)

Maybe add a second parameter, or add a new
inet_csk_shutdown_xmit_timers() only called from tcp_v4_destroy_sock() ?

>
>  void inet_csk_delete_keepalive_timer(struct sock *sk)
>  {
> -       sk_stop_timer(sk, &sk->sk_timer);
> +       sk_shutdown_timer(sk, &sk->sk_timer);

SO_KEEPALIVE can be called multiple times in a TCP socket lifetime,
on/off/on/off/...

I suggest leaving sk_stop_timer() here.

Eventually  inet_csk_clear_xmit_timers( sk, destroy=true) (or
inet_csk_shutdown_xmit_timers(())
   will  be called before the socket is destroyed.
Steven Rostedt Nov. 4, 2022, 5:51 a.m. UTC | #15
On Thu, 3 Nov 2022 17:00:20 -0700
Eric Dumazet <edumazet@google.com> wrote:

>  inet_csk_clear_xmit_timers() can be called multiple times during TCP
> socket lifetime.
> 
> (See tcp_disconnect(), which can be followed by another connect() ... and loop)
> 
> Maybe add a second parameter, or add a new
> inet_csk_shutdown_xmit_timers() only called from tcp_v4_destroy_sock() ?
> 

I guess.

> >
> >  void inet_csk_delete_keepalive_timer(struct sock *sk)
> >  {
> > -       sk_stop_timer(sk, &sk->sk_timer);
> > +       sk_shutdown_timer(sk, &sk->sk_timer);  
> 
> SO_KEEPALIVE can be called multiple times in a TCP socket lifetime,
> on/off/on/off/...
> 
> I suggest leaving sk_stop_timer() here.
> 
> Eventually  inet_csk_clear_xmit_timers( sk, destroy=true) (or
> inet_csk_shutdown_xmit_timers(())
>    will  be called before the socket is destroyed.

OK. 

Guenter,

I posted a new series, but did not include this change. If you want to
test that other series, I would suggest to at least add the first part
of this patch, otherwise it will trigger. But we want to see if there's
other locations of issue that we should care about.

-- Steve
Guenter Roeck Nov. 4, 2022, 4:14 p.m. UTC | #16
On Fri, Nov 04, 2022 at 01:51:39AM -0400, Steven Rostedt wrote:
> On Thu, 3 Nov 2022 17:00:20 -0700
> Eric Dumazet <edumazet@google.com> wrote:
> 
> >  inet_csk_clear_xmit_timers() can be called multiple times during TCP
> > socket lifetime.
> > 
> > (See tcp_disconnect(), which can be followed by another connect() ... and loop)
> > 
> > Maybe add a second parameter, or add a new
> > inet_csk_shutdown_xmit_timers() only called from tcp_v4_destroy_sock() ?
> > 
> 
> I guess.
> 
> > >
> > >  void inet_csk_delete_keepalive_timer(struct sock *sk)
> > >  {
> > > -       sk_stop_timer(sk, &sk->sk_timer);
> > > +       sk_shutdown_timer(sk, &sk->sk_timer);  
> > 
> > SO_KEEPALIVE can be called multiple times in a TCP socket lifetime,
> > on/off/on/off/...
> > 
> > I suggest leaving sk_stop_timer() here.
> > 
> > Eventually  inet_csk_clear_xmit_timers( sk, destroy=true) (or
> > inet_csk_shutdown_xmit_timers(())
> >    will  be called before the socket is destroyed.
> 
> OK. 
> 
> Guenter,
> 
> I posted a new series, but did not include this change. If you want to
> test that other series, I would suggest to at least add the first part
> of this patch, otherwise it will trigger. But we want to see if there's
> other locations of issue that we should care about.
> 

I'll run a test on the other series without change first. We'll see what
happens. If necessary I'll add [parts of] this patch and re-test, but
before doing that I would like to get a sense for the status of your
series as-is.

Thanks,
Guenter
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 2c07fa8ecfc8..81e9f232ca69 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -15528,7 +15528,7 @@  static int i40e_init_recovery_mode(struct i40e_pf *pf, struct i40e_hw *hw)
 
 err_switch_setup:
 	i40e_reset_interrupt_capability(pf);
-	del_timer_sync(&pf->service_timer);
+	del_timer_shutdown(&pf->service_timer);
 	i40e_shutdown_adminq(hw);
 	iounmap(hw->hw_addr);
 	pci_disable_pcie_error_reporting(pf->pdev);
@@ -16147,7 +16147,7 @@  static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	kfree(pf->vsi);
 err_switch_setup:
 	i40e_reset_interrupt_capability(pf);
-	del_timer_sync(&pf->service_timer);
+	del_timer_shutdown(&pf->service_timer);
 err_mac_addr:
 err_configure_lan_hmc:
 	(void)i40e_shutdown_lan_hmc(hw);
@@ -16209,7 +16209,7 @@  static void i40e_remove(struct pci_dev *pdev)
 	set_bit(__I40E_SUSPENDED, pf->state);
 	set_bit(__I40E_DOWN, pf->state);
 	if (pf->service_timer.function)
-		del_timer_sync(&pf->service_timer);
+		del_timer_shutdown(&pf->service_timer);
 	if (pf->service_task.func)
 		cancel_work_sync(&pf->service_task);
 
diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index ab33ba1c3023..9d8a9ae64681 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -5013,7 +5013,7 @@  static void sky2_remove(struct pci_dev *pdev)
 	if (!hw)
 		return;
 
-	del_timer_sync(&hw->watchdog_timer);
+	del_timer_shutdown(&hw->watchdog_timer);
 	cancel_work_sync(&hw->restart_work);
 
 	for (i = hw->ports-1; i >= 0; --i)
diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index acda6cbd0238..f008812356ef 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -524,7 +524,7 @@  static void vnet_port_remove(struct vio_dev *vdev)
 		hlist_del_rcu(&port->hash);
 
 		synchronize_rcu();
-		del_timer_sync(&port->clean_timer);
+		del_timer_shutdown(&port->clean_timer);
 		sunvnet_port_rm_txq_common(port);
 		netif_napi_del(&port->napi);
 		sunvnet_port_free_tx_bufs_common(port);
diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index b3ae949e6f1c..75d4956fc1e6 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -759,7 +759,7 @@  static void sierra_net_unbind(struct usbnet *dev, struct usb_interface *intf)
 	dev_dbg(&dev->udev->dev, "%s", __func__);
 
 	/* kill the timer and work */
-	del_timer_sync(&priv->sync_timer);
+	del_timer_shutdown(&priv->sync_timer);
 	cancel_work_sync(&priv->sierra_net_kevent);
 
 	/* tell modem we are going away */
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
index 3237d4b528b5..dced4d0384c7 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
@@ -371,7 +371,7 @@  void iwl_dbg_tlv_del_timers(struct iwl_trans *trans)
 	struct iwl_dbg_tlv_timer_node *node, *tmp;
 
 	list_for_each_entry_safe(node, tmp, timer_list, list) {
-		del_timer_sync(&node->timer);
+		del_timer_shutdown(&node->timer);
 		list_del(&node->list);
 		kfree(node);
 	}
diff --git a/drivers/net/wireless/intersil/hostap/hostap_ap.c b/drivers/net/wireless/intersil/hostap/hostap_ap.c
index 462ccc7d7d1a..34236d793b80 100644
--- a/drivers/net/wireless/intersil/hostap/hostap_ap.c
+++ b/drivers/net/wireless/intersil/hostap/hostap_ap.c
@@ -135,7 +135,7 @@  static void ap_free_sta(struct ap_data *ap, struct sta_info *sta)
 
 	if (!sta->ap)
 		kfree(sta->u.sta.challenge);
-	del_timer_sync(&sta->timer);
+	del_timer_shutdown(&sta->timer);
 #endif /* PRISM2_NO_KERNEL_IEEE80211_MGMT */
 
 	kfree(sta);
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index da2e6557e684..8fd4d603fe37 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -123,7 +123,7 @@  static int mwifiex_unregister(struct mwifiex_adapter *adapter)
 	if (adapter->if_ops.cleanup_if)
 		adapter->if_ops.cleanup_if(adapter);
 
-	del_timer_sync(&adapter->cmd_timer);
+	del_timer_shutdown(&adapter->cmd_timer);
 
 	/* Free private structures */
 	for (i = 0; i < adapter->priv_num; i++) {
diff --git a/drivers/net/wireless/microchip/wilc1000/hif.c b/drivers/net/wireless/microchip/wilc1000/hif.c
index eb1d1ba3a443..7a96f9828c97 100644
--- a/drivers/net/wireless/microchip/wilc1000/hif.c
+++ b/drivers/net/wireless/microchip/wilc1000/hif.c
@@ -1520,10 +1520,10 @@  int wilc_deinit(struct wilc_vif *vif)
 
 	mutex_lock(&vif->wilc->deinit_lock);
 
-	del_timer_sync(&hif_drv->scan_timer);
-	del_timer_sync(&hif_drv->connect_timer);
-	del_timer_sync(&vif->periodic_rssi);
-	del_timer_sync(&hif_drv->remain_on_ch_timer);
+	del_timer_shutdown(&hif_drv->scan_timer);
+	del_timer_shutdown(&hif_drv->connect_timer);
+	del_timer_shutdown(&vif->periodic_rssi);
+	del_timer_shutdown(&hif_drv->remain_on_ch_timer);
 
 	if (hif_drv->usr_scan_req.scan_result) {
 		hif_drv->usr_scan_req.scan_result(SCAN_EVENT_ABORTED, NULL,
diff --git a/net/802/garp.c b/net/802/garp.c
index fc9eb02a912f..610753f269ca 100644
--- a/net/802/garp.c
+++ b/net/802/garp.c
@@ -618,7 +618,7 @@  void garp_uninit_applicant(struct net_device *dev, struct garp_application *appl
 
 	/* Delete timer and generate a final TRANSMIT_PDU event to flush out
 	 * all pending messages before the applicant is gone. */
-	del_timer_sync(&app->join_timer);
+	del_timer_shutdown(&app->join_timer);
 
 	spin_lock_bh(&app->lock);
 	garp_gid_event(app, GARP_EVENT_TRANSMIT_PDU);
diff --git a/net/802/mrp.c b/net/802/mrp.c
index 155f74d8b14f..72d4680ce170 100644
--- a/net/802/mrp.c
+++ b/net/802/mrp.c
@@ -904,7 +904,7 @@  void mrp_uninit_applicant(struct net_device *dev, struct mrp_application *appl)
 	 * all pending messages before the applicant is gone.
 	 */
 	del_timer_sync(&app->join_timer);
-	del_timer_sync(&app->periodic_timer);
+	del_timer_shutdown(&app->periodic_timer);
 
 	spin_lock_bh(&app->lock);
 	mrp_mad_event(app, MRP_EVENT_TX);
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index db4f2641d1cd..0724c45049e4 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -605,7 +605,7 @@  static void br_multicast_destroy_mdb_entry(struct net_bridge_mcast_gc *gc)
 	WARN_ON(!hlist_unhashed(&mp->mdb_node));
 	WARN_ON(mp->ports);
 
-	del_timer_sync(&mp->timer);
+	del_timer_shutdown(&mp->timer);
 	kfree_rcu(mp, rcu);
 }
 
@@ -646,7 +646,7 @@  static void br_multicast_destroy_group_src(struct net_bridge_mcast_gc *gc)
 	src = container_of(gc, struct net_bridge_group_src, mcast_gc);
 	WARN_ON(!hlist_unhashed(&src->node));
 
-	del_timer_sync(&src->timer);
+	del_timer_shutdown(&src->timer);
 	kfree_rcu(src, rcu);
 }
 
@@ -671,7 +671,7 @@  static void br_multicast_destroy_port_group(struct net_bridge_mcast_gc *gc)
 	WARN_ON(!hlist_empty(&pg->src_list));
 
 	del_timer_sync(&pg->rexmit_timer);
-	del_timer_sync(&pg->timer);
+	del_timer_shutdown(&pg->timer);
 	kfree_rcu(pg, rcu);
 }
 
diff --git a/net/bridge/br_multicast_eht.c b/net/bridge/br_multicast_eht.c
index f91c071d1608..78dcfba2b16c 100644
--- a/net/bridge/br_multicast_eht.c
+++ b/net/bridge/br_multicast_eht.c
@@ -142,7 +142,7 @@  static void br_multicast_destroy_eht_set_entry(struct net_bridge_mcast_gc *gc)
 	set_h = container_of(gc, struct net_bridge_group_eht_set_entry, mcast_gc);
 	WARN_ON(!RB_EMPTY_NODE(&set_h->rb_node));
 
-	del_timer_sync(&set_h->timer);
+	del_timer_shutdown(&set_h->timer);
 	kfree(set_h);
 }
 
@@ -154,7 +154,7 @@  static void br_multicast_destroy_eht_set(struct net_bridge_mcast_gc *gc)
 	WARN_ON(!RB_EMPTY_NODE(&eht_set->rb_node));
 	WARN_ON(!RB_EMPTY_ROOT(&eht_set->entry_tree));
 
-	del_timer_sync(&eht_set->timer);
+	del_timer_shutdown(&eht_set->timer);
 	kfree(eht_set);
 }
 
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 4fcbdd71c59f..834287d0675e 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -208,7 +208,7 @@  void gen_kill_estimator(struct net_rate_estimator __rcu **rate_est)
 
 	est = xchg((__force struct net_rate_estimator **)rate_est, NULL);
 	if (est) {
-		del_timer_sync(&est->timer);
+		del_timer_shutdown(&est->timer);
 		kfree_rcu(est, rcu);
 	}
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index a3ba0358c77c..10cc84379d75 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3352,7 +3352,7 @@  EXPORT_SYMBOL(sk_stop_timer);
 
 void sk_stop_timer_sync(struct sock *sk, struct timer_list *timer)
 {
-	if (del_timer_sync(timer))
+	if (del_timer_shutdown(timer))
 		__sock_put(sk);
 }
 EXPORT_SYMBOL(sk_stop_timer_sync);
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 66fc940f9521..549a4c1990ea 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -208,7 +208,7 @@  EXPORT_SYMBOL_GPL(inet_twsk_alloc);
  */
 void inet_twsk_deschedule_put(struct inet_timewait_sock *tw)
 {
-	if (del_timer_sync(&tw->tw_timer))
+	if (del_timer_shutdown(&tw->tw_timer))
 		inet_twsk_kill(tw);
 	inet_twsk_put(tw);
 }
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index e04544ac4b45..459a80325247 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -412,7 +412,7 @@  static struct mr_table *ipmr_new_table(struct net *net, u32 id)
 
 static void ipmr_free_table(struct mr_table *mrt)
 {
-	del_timer_sync(&mrt->ipmr_expire_timer);
+	del_timer_shutdown(&mrt->ipmr_expire_timer);
 	mroute_clean_tables(mrt, MRT_FLUSH_VIFS | MRT_FLUSH_VIFS_STATIC |
 				 MRT_FLUSH_MFC | MRT_FLUSH_MFC_STATIC);
 	rhltable_destroy(&mrt->mfc_hash);
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index facdc78a43e5..9bd993046ebe 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -392,7 +392,7 @@  static struct mr_table *ip6mr_new_table(struct net *net, u32 id)
 
 static void ip6mr_free_table(struct mr_table *mrt)
 {
-	del_timer_sync(&mrt->ipmr_expire_timer);
+	del_timer_shutdown(&mrt->ipmr_expire_timer);
 	mroute_clean_tables(mrt, MRT6_FLUSH_MIFS | MRT6_FLUSH_MIFS_STATIC |
 				 MRT6_FLUSH_MFC | MRT6_FLUSH_MFC_STATIC);
 	rhltable_destroy(&mrt->mfc_hash);
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index acc1c299f1ae..d4c7c67a4dee 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -512,7 +512,7 @@  static void mesh_path_free_rcu(struct mesh_table *tbl,
 	mpath->flags |= MESH_PATH_RESOLVING | MESH_PATH_DELETED;
 	mesh_gate_del(tbl, mpath);
 	spin_unlock_bh(&mpath->state_lock);
-	del_timer_sync(&mpath->timer);
+	del_timer_shutdown(&mpath->timer);
 	atomic_dec(&sdata->u.mesh.mpaths);
 	atomic_dec(&tbl->entries);
 	mesh_path_flush_pending(mpath);
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index 5a67f7966574..6a8b0e80385b 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -427,7 +427,7 @@  list_set_destroy(struct ip_set *set)
 	struct set_elem *e, *n;
 
 	if (SET_WITH_TIMEOUT(set))
-		del_timer_sync(&map->gc);
+		del_timer_shutdown(&map->gc);
 
 	list_for_each_entry_safe(e, n, &map->members, list) {
 		list_del(&e->list);
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 7ac7473e3804..1f08ba927d0e 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -384,7 +384,7 @@  static void ip_vs_lblc_done_svc(struct ip_vs_service *svc)
 	struct ip_vs_lblc_table *tbl = svc->sched_data;
 
 	/* remove periodic timer */
-	del_timer_sync(&tbl->periodic_timer);
+	del_timer_shutdown(&tbl->periodic_timer);
 
 	/* got to clean up table entries here */
 	ip_vs_lblc_flush(svc);
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 77c323c36a88..f939a00826d6 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -547,7 +547,7 @@  static void ip_vs_lblcr_done_svc(struct ip_vs_service *svc)
 	struct ip_vs_lblcr_table *tbl = svc->sched_data;
 
 	/* remove periodic timer */
-	del_timer_sync(&tbl->periodic_timer);
+	del_timer_shutdown(&tbl->periodic_timer);
 
 	/* got to clean up table entries here */
 	ip_vs_lblcr_flush(svc);
diff --git a/net/netfilter/xt_LED.c b/net/netfilter/xt_LED.c
index 0371c387b0d1..0093fa1d07c6 100644
--- a/net/netfilter/xt_LED.c
+++ b/net/netfilter/xt_LED.c
@@ -166,7 +166,7 @@  static void led_tg_destroy(const struct xt_tgdtor_param *par)
 
 	list_del(&ledinternal->list);
 
-	del_timer_sync(&ledinternal->timer);
+	del_timer_shutdown(&ledinternal->timer);
 
 	led_trigger_unregister(&ledinternal->netfilter_led_trigger);
 
diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c
index 22089e37e97f..3f353f1f38ee 100644
--- a/net/rxrpc/conn_object.c
+++ b/net/rxrpc/conn_object.c
@@ -358,7 +358,7 @@  static void rxrpc_destroy_connection(struct rcu_head *rcu)
 
 	_net("DESTROY CONN %d", conn->debug_id);
 
-	del_timer_sync(&conn->timer);
+	del_timer_shutdown(&conn->timer);
 	rxrpc_purge_queue(&conn->rx_queue);
 
 	conn->security->clear(conn);
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 014cd3de7b5d..b23fbd2d4b5a 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -367,7 +367,7 @@  static const struct nla_policy flow_policy[TCA_FLOW_MAX + 1] = {
 
 static void __flow_destroy_filter(struct flow_filter *f)
 {
-	del_timer_sync(&f->perturb_timer);
+	del_timer_shutdown(&f->perturb_timer);
 	tcf_exts_destroy(&f->exts);
 	tcf_em_tree_destroy(&f->ematches);
 	tcf_exts_put_net(&f->exts);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 149171774bc6..b07bc9f9b3bd 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -567,7 +567,7 @@  svc_destroy(struct kref *ref)
 	struct svc_serv *serv = container_of(ref, struct svc_serv, sv_refcnt);
 
 	dprintk("svc: svc_destroy(%s)\n", serv->sv_program->pg_name);
-	del_timer_sync(&serv->sv_temptimer);
+	del_timer_shutdown(&serv->sv_temptimer);
 
 	/*
 	 * The last user is gone and thus all sockets have to be destroyed to
diff --git a/net/tipc/discover.c b/net/tipc/discover.c
index da69e1abf68f..09d69670506e 100644
--- a/net/tipc/discover.c
+++ b/net/tipc/discover.c
@@ -385,7 +385,7 @@  int tipc_disc_create(struct net *net, struct tipc_bearer *b,
  */
 void tipc_disc_delete(struct tipc_discoverer *d)
 {
-	del_timer_sync(&d->timer);
+	del_timer_shutdown(&d->timer);
 	kfree_skb(d->skb);
 	kfree(d);
 }
diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index 9618e4429f0f..cedc4a468315 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -700,7 +700,7 @@  void tipc_mon_delete(struct net *net, int bearer_id)
 	}
 	mon->self = NULL;
 	write_unlock_bh(&mon->lock);
-	del_timer_sync(&mon->timer);
+	del_timer_shutdown(&mon->timer);
 	kfree(self->domain);
 	kfree(self);
 	kfree(mon);