Message ID | 20210806082124.96607-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2] once: Fix panic when module unload | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Fri, 6 Aug 2021 16:21:24 +0800 Kefeng Wang wrote: > DO_ONCE > DEFINE_STATIC_KEY_TRUE(___once_key); > __do_once_done > once_disable_jump(once_key); > INIT_WORK(&w->work, once_deferred); > struct once_work *w; > w->key = key; > schedule_work(&w->work); module unload > //*the key is > destroy* > process_one_work > once_deferred > BUG_ON(!static_key_enabled(work->key)); > static_key_count((struct static_key *)x) //*access key, crash* > > When module uses DO_ONCE mechanism, it could crash due to the above > concurrency problem, we could reproduce it with link[1]. > > Fix it by add/put module refcount in the once work process. > > [1] https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ Seems reasonable. Greg does it look good to you? I think we can take it thru networking since nobody cared to pick up v1.
On Fri, Aug 06, 2021 at 06:43:28AM -0700, Jakub Kicinski wrote: > On Fri, 6 Aug 2021 16:21:24 +0800 Kefeng Wang wrote: > > DO_ONCE > > DEFINE_STATIC_KEY_TRUE(___once_key); > > __do_once_done > > once_disable_jump(once_key); > > INIT_WORK(&w->work, once_deferred); > > struct once_work *w; > > w->key = key; > > schedule_work(&w->work); module unload > > //*the key is > > destroy* > > process_one_work > > once_deferred > > BUG_ON(!static_key_enabled(work->key)); > > static_key_count((struct static_key *)x) //*access key, crash* > > > > When module uses DO_ONCE mechanism, it could crash due to the above > > concurrency problem, we could reproduce it with link[1]. > > > > Fix it by add/put module refcount in the once work process. > > > > [1] https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ > > Seems reasonable. Greg does it look good to you? Me? I was not paying attention to this at all, sorry... > I think we can take it thru networking since nobody cared to pick up v1. Sure, I trust you :)
On Fri, Aug 6, 2021, at 10:21, Kefeng Wang wrote: > DO_ONCE > DEFINE_STATIC_KEY_TRUE(___once_key); > __do_once_done > once_disable_jump(once_key); > INIT_WORK(&w->work, once_deferred); > struct once_work *w; > w->key = key; > schedule_work(&w->work); module unload > //*the key is > destroy* > process_one_work > once_deferred > BUG_ON(!static_key_enabled(work->key)); > static_key_count((struct static_key *)x) //*access key, crash* > > When module uses DO_ONCE mechanism, it could crash due to the above > concurrency problem, we could reproduce it with link[1]. > > Fix it by add/put module refcount in the once work process. > > [1] > https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> > Cc: Daniel Borkmann <daniel@iogearbox.net> > Cc: David S. Miller <davem@davemloft.net> > Cc: Eric Dumazet <edumazet@google.com> > Reported-by: Minmin chen <chenmingmin@huawei.com> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Thanks, Hannes
On 2021/8/6 23:22, Hannes Frederic Sowa wrote: > On Fri, Aug 6, 2021, at 10:21, Kefeng Wang wrote: >> DO_ONCE >> DEFINE_STATIC_KEY_TRUE(___once_key); >> __do_once_done >> once_disable_jump(once_key); >> INIT_WORK(&w->work, once_deferred); >> struct once_work *w; >> w->key = key; >> schedule_work(&w->work); module unload >> //*the key is >> destroy* >> process_one_work >> once_deferred >> BUG_ON(!static_key_enabled(work->key)); >> static_key_count((struct static_key *)x) //*access key, crash* >> >> When module uses DO_ONCE mechanism, it could crash due to the above >> concurrency problem, we could reproduce it with link[1]. >> >> Fix it by add/put module refcount in the once work process. >> >> [1] >> https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ >> >> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> >> Cc: Daniel Borkmann <daniel@iogearbox.net> >> Cc: David S. Miller <davem@davemloft.net> >> Cc: Eric Dumazet <edumazet@google.com> >> Reported-by: Minmin chen <chenmingmin@huawei.com> >> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> > Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Thanks. > > Thanks, > Hannes > . >
On 2021/8/6 21:51, Greg KH wrote: > On Fri, Aug 06, 2021 at 06:43:28AM -0700, Jakub Kicinski wrote: >> On Fri, 6 Aug 2021 16:21:24 +0800 Kefeng Wang wrote: >>> DO_ONCE >>> DEFINE_STATIC_KEY_TRUE(___once_key); >>> __do_once_done >>> once_disable_jump(once_key); >>> INIT_WORK(&w->work, once_deferred); >>> struct once_work *w; >>> w->key = key; >>> schedule_work(&w->work); module unload >>> //*the key is >>> destroy* >>> process_one_work >>> once_deferred >>> BUG_ON(!static_key_enabled(work->key)); >>> static_key_count((struct static_key *)x) //*access key, crash* >>> >>> When module uses DO_ONCE mechanism, it could crash due to the above >>> concurrency problem, we could reproduce it with link[1]. >>> >>> Fix it by add/put module refcount in the once work process. >>> >>> [1] https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ >> Seems reasonable. Greg does it look good to you? > Me? I was not paying attention to this at all, sorry... > >> I think we can take it thru networking since nobody cared to pick up v1. Thanks all ;) > Sure, I trust you :) > . >
diff --git a/include/linux/once.h b/include/linux/once.h index 9225ee6d96c7..ae6f4eb41cbe 100644 --- a/include/linux/once.h +++ b/include/linux/once.h @@ -7,7 +7,7 @@ bool __do_once_start(bool *done, unsigned long *flags); void __do_once_done(bool *done, struct static_key_true *once_key, - unsigned long *flags); + unsigned long *flags, struct module *mod); /* Call a function exactly once. The idea of DO_ONCE() is to perform * a function call such as initialization of random seeds, etc, only @@ -46,7 +46,7 @@ void __do_once_done(bool *done, struct static_key_true *once_key, if (unlikely(___ret)) { \ func(__VA_ARGS__); \ __do_once_done(&___done, &___once_key, \ - &___flags); \ + &___flags, THIS_MODULE); \ } \ } \ ___ret; \ diff --git a/lib/once.c b/lib/once.c index 8b7d6235217e..59149bf3bfb4 100644 --- a/lib/once.c +++ b/lib/once.c @@ -3,10 +3,12 @@ #include <linux/spinlock.h> #include <linux/once.h> #include <linux/random.h> +#include <linux/module.h> struct once_work { struct work_struct work; struct static_key_true *key; + struct module *module; }; static void once_deferred(struct work_struct *w) @@ -16,10 +18,11 @@ static void once_deferred(struct work_struct *w) work = container_of(w, struct once_work, work); BUG_ON(!static_key_enabled(work->key)); static_branch_disable(work->key); + module_put(work->module); kfree(work); } -static void once_disable_jump(struct static_key_true *key) +static void once_disable_jump(struct static_key_true *key, struct module *mod) { struct once_work *w; @@ -29,6 +32,8 @@ static void once_disable_jump(struct static_key_true *key) INIT_WORK(&w->work, once_deferred); w->key = key; + w->module = mod; + __module_get(mod); schedule_work(&w->work); } @@ -53,11 +58,11 @@ bool __do_once_start(bool *done, unsigned long *flags) EXPORT_SYMBOL(__do_once_start); void __do_once_done(bool *done, struct static_key_true *once_key, - unsigned long *flags) + unsigned long *flags, struct module *mod) __releases(once_lock) { *done = true; spin_unlock_irqrestore(&once_lock, *flags); - once_disable_jump(once_key); + once_disable_jump(once_key, mod); } EXPORT_SYMBOL(__do_once_done);
DO_ONCE DEFINE_STATIC_KEY_TRUE(___once_key); __do_once_done once_disable_jump(once_key); INIT_WORK(&w->work, once_deferred); struct once_work *w; w->key = key; schedule_work(&w->work); module unload //*the key is destroy* process_one_work once_deferred BUG_ON(!static_key_enabled(work->key)); static_key_count((struct static_key *)x) //*access key, crash* When module uses DO_ONCE mechanism, it could crash due to the above concurrency problem, we could reproduce it with link[1]. Fix it by add/put module refcount in the once work process. [1] https://lore.kernel.org/netdev/eaa6c371-465e-57eb-6be9-f4b16b9d7cbf@huawei.com/ Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Reported-by: Minmin chen <chenmingmin@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> --- v2: always pass THIS_MODULE to macro DO_ONCE(), suggested by hannes include/linux/once.h | 4 ++-- lib/once.c | 11 ++++++++--- 2 files changed, 10 insertions(+), 5 deletions(-)