Message ID | 20221115195802.415956561@linutronix.de (mailing list archive) |
---|---|
Headers | show |
Series | timers: Provide timer_shutdown[_sync]() | expand |
On Tue, Nov 15, 2022 at 09:28:32PM +0100, Thomas Gleixner wrote: > Tearing down timers can be tedious when there are circular dependencies to > other things which need to be torn down. A prime example is timer and > workqueue where the timer schedules work and the work arms the timer. > > Steven and the Google Chromebook team ran into such an issue in the > Bluetooth HCI code. > > Steven suggested to create a new function del_timer_free() which marks the > timer as shutdown. Rearm attempts of shutdown timers are discarded and he > wanted to emit a warning for that case: > > https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home > > This resulted in a lengthy discussion and suggestions how this should be > implemented. The patch series went through several iterations and during > the review of the last version it turned out that this approach is > suboptimal: > > https://lore.kernel.org/all/20221110064101.429013735@goodmis.org > > The warning is not really helpful because it's entirely unclear how it > should be acted upon. The only way to address such a case is to add 'if > (in_shutdown)' conditionals all over the place. This is error prone and in > most cases of teardown like the HCI one which started this discussion not > required all. > > What needs to prevented is that pending work which is drained via > destroy_workqueue() does not rearm the previously shutdown timer. Nothing > in that shutdown sequence relies on the timer being functional. > > The conclusion was that the semantics of timer_shutdown_sync() should be: > > - timer is not enqueued > - timer callback is not running > - timer cannot be rearmed > > Preventing the rearming of shutdown timers is done by discarding rearm > attempts silently. > > As Steven is short of cycles, I made some spare cycles available and > reworked the patch series to follow the new semantics and plugged the races > which were discovered during review. > > The patches have been split up into small pieces to make review easier and > I took the liberty to throw a bunch of overdue cleanups into the picture > instead of proliferating the existing state further. > > The last patch in the series addresses the HCI teardown issue for real. > I applied the series to the top of v6.1-rc5, and also applied the result of running the coccinelle script to auto-convert simple cases. Running this set of patches through my testbed showed no build errors, runtime failures, or warnings. I also backported the series to chromeos-5.15, again applied the coccinelle generated patches, and ran it through a regression test. No failures either. With that, for the series, Tested-by: Guenter Roeck <linux@roeck-us.net> Let me know if I should send individual tags for each patch in the series. Thanks, Guenter
On Tue, Nov 15 2022 at 21:28, Thomas Gleixner wrote: > > As Steven is short of cycles, I made some spare cycles available and > reworked the patch series to follow the new semantics and plugged the races > which were discovered during review. Any opinions on this pile or should I just declare it's perfect and queue it for 6.2? Thanks, tglx
On Mon, 21 Nov 2022 16:15:00 +0100 Thomas Gleixner <tglx@linutronix.de> wrote: > > As Steven is short of cycles, I made some spare cycles available and > > reworked the patch series to follow the new semantics and plugged the races > > which were discovered during review. > > Any opinions on this pile or should I just declare it's perfect and > queue it for 6.2? I have time cut out of today to go over it. Thanks Thomas, -- Steve
On Tue, 15 Nov 2022 21:28:32 +0100 (CET) Thomas Gleixner <tglx@linutronix.de> wrote: > The patches have been split up into small pieces to make review easier and > I took the liberty to throw a bunch of overdue cleanups into the picture > instead of proliferating the existing state further. After applying all these patches, and then my updates to the rest of the kernel, as well as my update to the debug objects to require shutdown. It reported this was needed: -- Steve diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 0fbb71950ca2..3e84a2621913 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -2188,7 +2188,7 @@ signed long __sched schedule_timeout(signed long timeout) timer_setup_on_stack(&timer.timer, process_timeout, 0); __mod_timer(&timer.timer, expire, MOD_TIMER_NOTPENDING); schedule(); - del_timer_sync(&timer.timer); + timer_shutdown_sync(&timer.timer); /* Remove the timer from the object tracker */ destroy_timer_on_stack(&timer.timer);