Message ID | 20240618213421.282381-1-shakeel.butt@linux.dev (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: ratelimit oversized kvmalloc warnings instead of once | expand |
On Tue, 18 Jun 2024 at 14:34, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > Simply replace WARN_ON_ONCE with WARN_RATELIMIT. NAK. Sadly, the RATELIMIT cases are useless. The normal rate limiting is basically "burst of up to ten, every five seconds". That's going to completely swamp things and hide any other issue. If we ratelimit it to "at most 1 per hour", maybe something like that would be acceptable. But honestly, I do not understand your "first abuser only" complaint. There should not be *any* abusers. So just fix that first one already. If you have more than one, you have bigger issues. So what is the real reason for this broken patch? Why didn't you fix the first one? Linus
On Tue, Jun 18, 2024 at 02:40:10PM GMT, Linus Torvalds wrote: > On Tue, 18 Jun 2024 at 14:34, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > > > Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > NAK. > > Sadly, the RATELIMIT cases are useless. > > The normal rate limiting is basically "burst of up to ten, every five seconds". > > That's going to completely swamp things and hide any other issue. > > If we ratelimit it to "at most 1 per hour", maybe something like that > would be acceptable. "at most 1 per hour" sounds good. > > But honestly, I do not understand your "first abuser only" complaint. > There should not be *any* abusers. So just fix that first one already. > > If you have more than one, you have bigger issues. So what is the real > reason for this broken patch? Why didn't you fix the first one? > The issue is the turnaround time to fix the first abuser and deploy the fixed kernel to big enough fleet to find the remaining abusers. Usually this turnaround time is in months.
On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > At the moment oversize kvmalloc warnings are triggered once using > WARN_ON_ONCE() macro. One issue with this approach is that it only > detects the first abuser and then ignores the remaining abusers which > complicates detecting all such abusers in a timely manner. The situation > becomes worse when the repro has low probability and requires production > traffic and thus require large set of machines to find such abusers. In > Mera production, this warn once is slowing down the detection of these > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. Long time ago, I've had a patch to do the once_per_callsite WARN. I cannot find reference at the moment but it used stack depot to note stacks that have already triggered. Back then there was no reponse on the ML. Should I try to dig deep and recover it from my archives? I think this is exactly kind of usecase where it would fit. > Reported-by: Kyle McMartin <kyle@infradead.org> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> > --- > mm/util.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/util.c b/mm/util.c > index 10f215985fe5..de36344e8d53 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -649,7 +649,8 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) > > /* Don't even allow crazy sizes */ > if (unlikely(size > INT_MAX)) { > - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > + WARN_RATELIMIT(!(flags & __GFP_NOWARN), "size = %zu > INT_MAX", > + size); > return NULL; > } > > -- > 2.43.0
On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote: > On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > > At the moment oversize kvmalloc warnings are triggered once using > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > detects the first abuser and then ignores the remaining abusers which > > complicates detecting all such abusers in a timely manner. The situation > > becomes worse when the repro has low probability and requires production > > traffic and thus require large set of machines to find such abusers. In > > Mera production, this warn once is slowing down the detection of these > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > Long time ago, I've had a patch to do the once_per_callsite WARN. I > cannot find reference at the moment but it used stack depot to note > stacks that have already triggered. Back then there was no reponse on > the ML. Should I try to dig deep and recover it from my archives? I > think this is exactly kind of usecase where it would fit. > Do you mean something like warn once per unique call stack? If yes then I think that is better than the simple ratelimiting version as ratelimiting one may still miss some abusers and also may keep warning about the same abuser. Please do share your patch. Thanks, Shakeel
On Wed 19-06-24 01:03:16, Shakeel Butt wrote: > On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote: > > On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > > > At the moment oversize kvmalloc warnings are triggered once using > > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > > detects the first abuser and then ignores the remaining abusers which > > > complicates detecting all such abusers in a timely manner. The situation > > > becomes worse when the repro has low probability and requires production > > > traffic and thus require large set of machines to find such abusers. In > > > Mera production, this warn once is slowing down the detection of these > > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > Long time ago, I've had a patch to do the once_per_callsite WARN. I > > cannot find reference at the moment but it used stack depot to note > > stacks that have already triggered. Back then there was no reponse on > > the ML. Should I try to dig deep and recover it from my archives? I > > think this is exactly kind of usecase where it would fit. > > > > Do you mean something like warn once per unique call stack? Exactly! > If yes then > I think that is better than the simple ratelimiting version as > ratelimiting one may still miss some abusers and also may keep warning > about the same abuser. Please do share your patch. https://lore.kernel.org/all/20170103134424.28123-1-mhocko@kernel.org/
On Wed, Jun 19, 2024 at 10:30:46AM GMT, Michal Hocko wrote: > On Wed 19-06-24 01:03:16, Shakeel Butt wrote: > > On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote: > > > On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > > > > At the moment oversize kvmalloc warnings are triggered once using > > > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > > > detects the first abuser and then ignores the remaining abusers which > > > > complicates detecting all such abusers in a timely manner. The situation > > > > becomes worse when the repro has low probability and requires production > > > > traffic and thus require large set of machines to find such abusers. In > > > > Mera production, this warn once is slowing down the detection of these > > > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > > > Long time ago, I've had a patch to do the once_per_callsite WARN. I > > > cannot find reference at the moment but it used stack depot to note > > > stacks that have already triggered. Back then there was no reponse on > > > the ML. Should I try to dig deep and recover it from my archives? I > > > think this is exactly kind of usecase where it would fit. > > > > > > > Do you mean something like warn once per unique call stack? > > Exactly! > > > If yes then > > I think that is better than the simple ratelimiting version as > > ratelimiting one may still miss some abusers and also may keep warning > > about the same abuser. Please do share your patch. > > https://lore.kernel.org/all/20170103134424.28123-1-mhocko@kernel.org/ Do you want to propose this patch again (after rebase to latest) or should I take over?
On Wed 19-06-24 10:30:46, Michal Hocko wrote: > On Wed 19-06-24 01:03:16, Shakeel Butt wrote: > > On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote: > > > On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > > > > At the moment oversize kvmalloc warnings are triggered once using > > > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > > > detects the first abuser and then ignores the remaining abusers which > > > > complicates detecting all such abusers in a timely manner. The situation > > > > becomes worse when the repro has low probability and requires production > > > > traffic and thus require large set of machines to find such abusers. In > > > > Mera production, this warn once is slowing down the detection of these > > > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > > > Long time ago, I've had a patch to do the once_per_callsite WARN. I > > > cannot find reference at the moment but it used stack depot to note > > > stacks that have already triggered. Back then there was no reponse on > > > the ML. Should I try to dig deep and recover it from my archives? I > > > think this is exactly kind of usecase where it would fit. > > > > > > > Do you mean something like warn once per unique call stack? > > Exactly! > > > If yes then > > I think that is better than the simple ratelimiting version as > > ratelimiting one may still miss some abusers and also may keep warning > > about the same abuser. Please do share your patch. > > https://lore.kernel.org/all/20170103134424.28123-1-mhocko@kernel.org/ Btw. the code has changed a lot since 2017 when this was posted so it will likely need a lot of massaging to rebase. Also I am not entirely sure it is ok to change WARN_ONCE semantic like that anymore. Maybe we need an explicit variant that does this per-call-site warnings. It is a notable difference between library functions which can be called from different callpaths and those that are used from a single place. I do not have much time to dig deeper into this but if you want to take over then go ahead. I still think this is a useful WARN_ONCE or in general do_something_once semantic.
On Tue, Jun 18, 2024 at 02:34:21PM -0700, Shakeel Butt wrote: > At the moment oversize kvmalloc warnings are triggered once using > WARN_ON_ONCE() macro. One issue with this approach is that it only > detects the first abuser and then ignores the remaining abusers which > complicates detecting all such abusers in a timely manner. The situation > becomes worse when the repro has low probability and requires production > traffic and thus require large set of machines to find such abusers. In > Mera production, this warn once is slowing down the detection of these > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > Reported-by: Kyle McMartin <kyle@infradead.org> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> > --- > mm/util.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/util.c b/mm/util.c > index 10f215985fe5..de36344e8d53 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -649,7 +649,8 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) > > /* Don't even allow crazy sizes */ > if (unlikely(size > INT_MAX)) { > - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > + WARN_RATELIMIT(!(flags & __GFP_NOWARN), "size = %zu > INT_MAX", > + size); > return NULL; > } > I don't think this is necessary. From the description I think interested parties can get away with bpftrace. Suppose you have an abuser of the sort and you are worried there is more than one. Then this one-liner will catch *all* of them, not just the ones which were "lucky" to get logged with ratelimit: bpftrace -e 'kprobe:kvmalloc_node_noprof /arg0 > 2147483647/ { @[kstack()] = count(); }' Of course adding a probe is not free, but then again kvmalloc should not be used often to begin with so I doubt it is going to have material impact in terms of performance. While I concede it takes more effort to get this running on all affected machines, the result is much better than mere ratelimit. Also there is no need to patch the kernel. btw, I found drm keeps spamming kvmalloc, someone(tm) should look into it: @[ kvmalloc_node_noprof+5 drm_property_create_blob+76 drm_atomic_helper_dirtyfb+234 drm_fbdev_generic_helper_fb_dirty+509 drm_fb_helper_damage_work+139 process_one_work+376 worker_thread+753 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , 104]: 12
On Wed, Jun 19, 2024 at 2:49 PM Mateusz Guzik <mjguzik@gmail.com> wrote: > > On Tue, Jun 18, 2024 at 02:34:21PM -0700, Shakeel Butt wrote: > > At the moment oversize kvmalloc warnings are triggered once using > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > detects the first abuser and then ignores the remaining abusers which > > complicates detecting all such abusers in a timely manner. The situation > > becomes worse when the repro has low probability and requires production > > traffic and thus require large set of machines to find such abusers. In > > Mera production, this warn once is slowing down the detection of these > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > Reported-by: Kyle McMartin <kyle@infradead.org> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> > > --- > > mm/util.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/mm/util.c b/mm/util.c > > index 10f215985fe5..de36344e8d53 100644 > > --- a/mm/util.c > > +++ b/mm/util.c > > @@ -649,7 +649,8 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) > > > > /* Don't even allow crazy sizes */ > > if (unlikely(size > INT_MAX)) { > > - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > > + WARN_RATELIMIT(!(flags & __GFP_NOWARN), "size = %zu > INT_MAX", > > + size); > > return NULL; > > } > > > > I don't think this is necessary. From the description I think interested > parties can get away with bpftrace. > > Suppose you have an abuser of the sort and you are worried there is more > than one. > > Then this one-liner will catch *all* of them, not just the ones which > were "lucky" to get logged with ratelimit: > bpftrace -e 'kprobe:kvmalloc_node_noprof /arg0 > 2147483647/ { @[kstack()] = count(); }' > > Of course adding a probe is not free, but then again kvmalloc should not > be used often to begin with so I doubt it is going to have material > impact in terms of performance. > > While I concede it takes more effort to get this running on all affected > machines, the result is much better than mere ratelimit. Also there is > no need to patch the kernel. > > btw, I found drm keeps spamming kvmalloc, someone(tm) should look into > it: > @[ > kvmalloc_node_noprof+5 > drm_property_create_blob+76 > drm_atomic_helper_dirtyfb+234 > drm_fbdev_generic_helper_fb_dirty+509 > drm_fb_helper_damage_work+139 > process_one_work+376 > worker_thread+753 > kthread+207 > ret_from_fork+49 > ret_from_fork_asm+26 > , 104]: 12 I should clarify this is allocs of 104 bytes, not some outlandish size.
On Wed, Jun 19, 2024 at 10:48:07AM +0200, Michal Hocko wrote: > On Wed 19-06-24 10:30:46, Michal Hocko wrote: > > On Wed 19-06-24 01:03:16, Shakeel Butt wrote: > > > On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote: > > > > On Tue 18-06-24 14:34:21, Shakeel Butt wrote: > > > > > At the moment oversize kvmalloc warnings are triggered once using > > > > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > > > > detects the first abuser and then ignores the remaining abusers which > > > > > complicates detecting all such abusers in a timely manner. The situation > > > > > becomes worse when the repro has low probability and requires production > > > > > traffic and thus require large set of machines to find such abusers. In > > > > > Mera production, this warn once is slowing down the detection of these > > > > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > > > > > Long time ago, I've had a patch to do the once_per_callsite WARN. I > > > > cannot find reference at the moment but it used stack depot to note > > > > stacks that have already triggered. Back then there was no reponse on > > > > the ML. Should I try to dig deep and recover it from my archives? I > > > > think this is exactly kind of usecase where it would fit. > > > > > > > > > > Do you mean something like warn once per unique call stack? > > > > Exactly! > > > > > If yes then > > > I think that is better than the simple ratelimiting version as > > > ratelimiting one may still miss some abusers and also may keep warning > > > about the same abuser. Please do share your patch. > > > > https://lore.kernel.org/all/20170103134424.28123-1-mhocko@kernel.org/ > > Btw. the code has changed a lot since 2017 when this was posted so it > will likely need a lot of massaging to rebase. Also I am not entirely > sure it is ok to change WARN_ONCE semantic like that anymore. Maybe we > need an explicit variant that does this per-call-site warnings. It is a > notable difference between library functions which can be called from > different callpaths and those that are used from a single place. I do > not have much time to dig deeper into this but if you want to take over > then go ahead. I still think this is a useful WARN_ONCE or in general > do_something_once semantic. I think a separate variant like WARN_UNIQUE() would be better. I will look into this. Linus, please let me know if you have any concerns on the approach Michal is suggesting i.e. a variant for warn once for unique call stack. Shakeel
On Wed, Jun 19, 2024 at 02:49:21PM +0200, Mateusz Guzik wrote: > On Tue, Jun 18, 2024 at 02:34:21PM -0700, Shakeel Butt wrote: > > At the moment oversize kvmalloc warnings are triggered once using > > WARN_ON_ONCE() macro. One issue with this approach is that it only > > detects the first abuser and then ignores the remaining abusers which > > complicates detecting all such abusers in a timely manner. The situation > > becomes worse when the repro has low probability and requires production > > traffic and thus require large set of machines to find such abusers. In > > Mera production, this warn once is slowing down the detection of these > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. > > > > Reported-by: Kyle McMartin <kyle@infradead.org> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> > > --- > > mm/util.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/mm/util.c b/mm/util.c > > index 10f215985fe5..de36344e8d53 100644 > > --- a/mm/util.c > > +++ b/mm/util.c > > @@ -649,7 +649,8 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) > > > > /* Don't even allow crazy sizes */ > > if (unlikely(size > INT_MAX)) { > > - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > > + WARN_RATELIMIT(!(flags & __GFP_NOWARN), "size = %zu > INT_MAX", > > + size); > > return NULL; > > } > > > > I don't think this is necessary. From the description I think interested > parties can get away with bpftrace. > > Suppose you have an abuser of the sort and you are worried there is more > than one. > > Then this one-liner will catch *all* of them, not just the ones which > were "lucky" to get logged with ratelimit: > bpftrace -e 'kprobe:kvmalloc_node_noprof /arg0 > 2147483647/ { @[kstack()] = count(); }' > > Of course adding a probe is not free, but then again kvmalloc should not > be used often to begin with so I doubt it is going to have material > impact in terms of performance. > > While I concede it takes more effort to get this running on all affected > machines, the result is much better than mere ratelimit. Also there is > no need to patch the kernel. > Thanks for the response and suggestion. I am inclined towards warn once for each unique stack trace as suggested by Michal because I think it would be useful in general.
On Wed, 19 Jun 2024 at 10:47, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > Linus, please let me know if you have any concerns on the approach > Michal is suggesting i.e. a variant for warn once for unique call stack. I think we should just try to change the existing WARN_ONCE(), and see if it causes any issues. A new "WARN_UNIQUE()" might be the borign and safe approach, but (a) it won't actually be unique if you don't have stackdepot anyway, and will just be WARN_ONCE (b) I suspect most WARN_ONCE users really do want WARN_UNIQUE so let's at least _start_ with just changing semantics of the existing "once", and then if it causes problems we'll have to revisit this. I doubt it will cause problems, Linus
On Wed 19-06-24 12:30:42, Linus Torvalds wrote: > On Wed, 19 Jun 2024 at 10:47, Shakeel Butt <shakeel.butt@linux.dev> wrote: > > > > Linus, please let me know if you have any concerns on the approach > > Michal is suggesting i.e. a variant for warn once for unique call stack. > > I think we should just try to change the existing WARN_ONCE(), and see > if it causes any issues. > > A new "WARN_UNIQUE()" might be the borign and safe approach, but > > (a) it won't actually be unique if you don't have stackdepot anyway, > and will just be WARN_ONCE > > (b) I suspect most WARN_ONCE users really do want WARN_UNIQUE > > so let's at least _start_ with just changing semantics of the existing > "once", and then if it causes problems we'll have to revisit this. > > I doubt it will cause problems, I would be careful about the WARN_ONCE used from stackdepot itself. It's been some time since I have looked into that code but a quick grep tells there is some usage.
diff --git a/mm/util.c b/mm/util.c index 10f215985fe5..de36344e8d53 100644 --- a/mm/util.c +++ b/mm/util.c @@ -649,7 +649,8 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) /* Don't even allow crazy sizes */ if (unlikely(size > INT_MAX)) { - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); + WARN_RATELIMIT(!(flags & __GFP_NOWARN), "size = %zu > INT_MAX", + size); return NULL; }
At the moment oversize kvmalloc warnings are triggered once using WARN_ON_ONCE() macro. One issue with this approach is that it only detects the first abuser and then ignores the remaining abusers which complicates detecting all such abusers in a timely manner. The situation becomes worse when the repro has low probability and requires production traffic and thus require large set of machines to find such abusers. In Mera production, this warn once is slowing down the detection of these abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT. Reported-by: Kyle McMartin <kyle@infradead.org> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> --- mm/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)