Message ID | 20201201215900.3569844-1-guro@fb.com (mailing list archive) |
---|---|
Headers | show |
Series | bpf: switch to memcg-based memory accounting | expand |
Hello: This series was applied to bpf/bpf-next.git (refs/heads/master): On Tue, 1 Dec 2020 13:58:26 -0800 you wrote: > Currently bpf is using the memlock rlimit for the memory accounting. > This approach has its downsides and over time has created a significant > amount of problems: > > 1) The limit is per-user, but because most bpf operations are performed > as root, the limit has a little value. > > [...] Here is the summary with links: - [bpf-next,v9,01/34] mm: memcontrol: use helpers to read page's memcg data https://git.kernel.org/bpf/bpf-next/c/bcfe06bf2622 - [bpf-next,v9,02/34] mm: memcontrol/slab: use helpers to access slab page's memcg_data https://git.kernel.org/bpf/bpf-next/c/270c6a71460e - [bpf-next,v9,03/34] mm: introduce page memcg flags https://git.kernel.org/bpf/bpf-next/c/87944e2992bd - [bpf-next,v9,04/34] mm: convert page kmemcg type to a page memcg flag https://git.kernel.org/bpf/bpf-next/c/18b2db3b0385 - [bpf-next,v9,05/34] bpf: memcg-based memory accounting for bpf progs https://git.kernel.org/bpf/bpf-next/c/ddf8503c7c43 - [bpf-next,v9,06/34] bpf: prepare for memcg-based memory accounting for bpf maps https://git.kernel.org/bpf/bpf-next/c/48edc1f78aab - [bpf-next,v9,07/34] bpf: memcg-based memory accounting for bpf maps https://git.kernel.org/bpf/bpf-next/c/d5299b67dd59 - [bpf-next,v9,08/34] bpf: refine memcg-based memory accounting for arraymap maps https://git.kernel.org/bpf/bpf-next/c/6d192c7938b7 - [bpf-next,v9,09/34] bpf: refine memcg-based memory accounting for cpumap maps https://git.kernel.org/bpf/bpf-next/c/e88cc05b61f3 - [bpf-next,v9,10/34] bpf: memcg-based memory accounting for cgroup storage maps https://git.kernel.org/bpf/bpf-next/c/3a61c7c58b30 - [bpf-next,v9,11/34] bpf: refine memcg-based memory accounting for devmap maps https://git.kernel.org/bpf/bpf-next/c/1440290adf7b - [bpf-next,v9,12/34] bpf: refine memcg-based memory accounting for hashtab maps https://git.kernel.org/bpf/bpf-next/c/881456811a33 - [bpf-next,v9,13/34] bpf: memcg-based memory accounting for lpm_trie maps https://git.kernel.org/bpf/bpf-next/c/353e7af4bf5e - [bpf-next,v9,14/34] bpf: memcg-based memory accounting for bpf ringbuffer https://git.kernel.org/bpf/bpf-next/c/be4035c734d1 - [bpf-next,v9,15/34] bpf: memcg-based memory accounting for bpf local storage maps https://git.kernel.org/bpf/bpf-next/c/e9aae8beba82 - [bpf-next,v9,16/34] bpf: refine memcg-based memory accounting for sockmap and sockhash maps https://git.kernel.org/bpf/bpf-next/c/7846dd9f835e - [bpf-next,v9,17/34] bpf: refine memcg-based memory accounting for xskmap maps https://git.kernel.org/bpf/bpf-next/c/28e1dcdef0cb - [bpf-next,v9,18/34] bpf: eliminate rlimit-based memory accounting for arraymap maps https://git.kernel.org/bpf/bpf-next/c/1bc5975613ed - [bpf-next,v9,19/34] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps https://git.kernel.org/bpf/bpf-next/c/f043733f31e5 - [bpf-next,v9,20/34] bpf: eliminate rlimit-based memory accounting for cpumap maps https://git.kernel.org/bpf/bpf-next/c/711cabaf1432 - [bpf-next,v9,21/34] bpf: eliminate rlimit-based memory accounting for cgroup storage maps https://git.kernel.org/bpf/bpf-next/c/087b0d39fe22 - [bpf-next,v9,22/34] bpf: eliminate rlimit-based memory accounting for devmap maps https://git.kernel.org/bpf/bpf-next/c/844f157f6c0a - [bpf-next,v9,23/34] bpf: eliminate rlimit-based memory accounting for hashtab maps https://git.kernel.org/bpf/bpf-next/c/755e5d55367a - [bpf-next,v9,24/34] bpf: eliminate rlimit-based memory accounting for lpm_trie maps https://git.kernel.org/bpf/bpf-next/c/cbddcb574d41 - [bpf-next,v9,25/34] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps https://git.kernel.org/bpf/bpf-next/c/a37fb7ef24a4 - [bpf-next,v9,26/34] bpf: eliminate rlimit-based memory accounting for reuseport_array maps https://git.kernel.org/bpf/bpf-next/c/db54330d3e13 - [bpf-next,v9,27/34] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer https://git.kernel.org/bpf/bpf-next/c/abbdd0813f34 - [bpf-next,v9,28/34] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps https://git.kernel.org/bpf/bpf-next/c/0d2c4f964050 - [bpf-next,v9,29/34] bpf: eliminate rlimit-based memory accounting for stackmap maps https://git.kernel.org/bpf/bpf-next/c/370868107bf6 - [bpf-next,v9,30/34] bpf: eliminate rlimit-based memory accounting for xskmap maps https://git.kernel.org/bpf/bpf-next/c/819a4f323579 - [bpf-next,v9,31/34] bpf: eliminate rlimit-based memory accounting for bpf local storage maps https://git.kernel.org/bpf/bpf-next/c/ab31be378a63 - [bpf-next,v9,32/34] bpf: eliminate rlimit-based memory accounting infra for bpf maps https://git.kernel.org/bpf/bpf-next/c/80ee81e0403c - [bpf-next,v9,33/34] bpf: eliminate rlimit-based memory accounting for bpf progs https://git.kernel.org/bpf/bpf-next/c/3ac1f01b43b6 - [bpf-next,v9,34/34] bpf: samples: do not touch RLIMIT_MEMLOCK https://git.kernel.org/bpf/bpf-next/c/5b0764b2d345 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
On Tue, Dec 1, 2020 at 1:59 PM Roman Gushchin <guro@fb.com> wrote: > > 5) Cryptic -EPERM is returned on exceeding the limit. Libbpf even had > a function to "explain" this case for users. ... > v9: > - always charge the saved memory cgroup, by Daniel, Toke and Alexei > - added bpf_map_kzalloc() > - rebase and minor fixes This looks great. Applied. Please follow up with a change to libbpf's pr_perm_msg(). That helpful warning should stay for old kernels, but it would be misleading for new kernels. libbpf probably needs a feature check to make this warning conditional. Thanks!
On Wed, Dec 02, 2020 at 06:54:46PM -0800, Alexei Starovoitov wrote: > On Tue, Dec 1, 2020 at 1:59 PM Roman Gushchin <guro@fb.com> wrote: > > > > 5) Cryptic -EPERM is returned on exceeding the limit. Libbpf even had > > a function to "explain" this case for users. > ... > > v9: > > - always charge the saved memory cgroup, by Daniel, Toke and Alexei > > - added bpf_map_kzalloc() > > - rebase and minor fixes > > This looks great. Applied. Thanks! > Please follow up with a change to libbpf's pr_perm_msg(). > That helpful warning should stay for old kernels, but it would be > misleading for new kernels. > libbpf probably needs a feature check to make this warning conditional. I think we've discussed it several months ago and at that time we didn't find a good way to check this feature. I'll think again, but if somebody has any ideas here, I'll appreciate a lot.
On Fri, Dec 4, 2020 at 4:37 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 12/3/20 4:26 AM, Roman Gushchin wrote: > > On Wed, Dec 02, 2020 at 06:54:46PM -0800, Alexei Starovoitov wrote: > >> On Tue, Dec 1, 2020 at 1:59 PM Roman Gushchin <guro@fb.com> wrote: > >>> > >>> 5) Cryptic -EPERM is returned on exceeding the limit. Libbpf even had > >>> a function to "explain" this case for users. > >> ... > >>> v9: > >>> - always charge the saved memory cgroup, by Daniel, Toke and Alexei > >>> - added bpf_map_kzalloc() > >>> - rebase and minor fixes > >> > >> This looks great. Applied. > > > > Thanks! > > > >> Please follow up with a change to libbpf's pr_perm_msg(). > >> That helpful warning should stay for old kernels, but it would be > >> misleading for new kernels. > >> libbpf probably needs a feature check to make this warning conditional. > > > > I think we've discussed it several months ago and at that time we didn't > > find a good way to check this feature. I'll think again, but if somebody > > has any ideas here, I'll appreciate a lot. > > Hm, bit tricky, agree .. given we only throw the warning in pr_perm_msg() for > non-root and thus probing options are also limited, otherwise just probing for > a helper that was added in this same cycle would have been good enough as a > simple heuristic. I wonder if it would make sense to add some hint inside the > bpf_{prog,map}_show_fdinfo() to indicate that accounting with memcg is enabled I think the initial version was emitting 0 for memlock, so that was a pretty simple way to prove stuff. But I think it was changed at the last minute to emit some non-zero "estimate" of memory used or something like that? > for the prog/map one way or another? Not just for the sake of pr_perm_msg(), but > in general for apps to stop messing with rlimit at this point. Maybe also bpftool > feature probe could be extended to indicate that as well (e.g. the json output > can be fed into Go natively).