Message ID | 20211012135935.37054-5-lmb@cloudflare.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | Fix up bpf_jit_limit some more |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
bpf/vmtest-bpf | fail | VM_Test |
bpf/vmtest-bpf-PR | fail | PR summary |
bpf/vmtest-bpf-next | success | VM_Test |
bpf/vmtest-bpf-next-PR | success | PR summary |

On 12/10/2021 at 15:59, Lorenz Bauer wrote:
> Expose bpf_jit_current as a read only value via sysctl.
>
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---

[snip]

> +	{
> +		.procname	= "bpf_jit_current",
> +		.data		= &bpf_jit_current,
> +		.maxlen		= sizeof(long),
> +		.mode		= 0400,
Why not 0444?

Regards,
Nicolas

On Tue, 12 Oct 2021 at 17:29, Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
>
> On 12/10/2021 at 15:59, Lorenz Bauer wrote:
> > Expose bpf_jit_current as a read only value via sysctl.
> >
> > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> > ---
>
> [snip]
>
> > +	{
> > +		.procname	= "bpf_jit_current",
> > +		.data		= &bpf_jit_current,
> > +		.maxlen		= sizeof(long),
> > +		.mode		= 0400,
> Why not 0444?

This mirrors what the other BPF related sysctls do, which only allow
access from root with CAP_SYS_ADMIN. I'd prefer 0444 as well, but
Daniel explicitly locked down these sysctls in
2e4a30983b0f9b19b59e38bbf7427d7fdd480d98.

Lorenz

--
Lorenz Bauer | Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

On 13/10/2021 at 10:35, Lorenz Bauer wrote:
> On Tue, 12 Oct 2021 at 17:29, Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
>>
>> On 12/10/2021 at 15:59, Lorenz Bauer wrote:
>>> Expose bpf_jit_current as a read only value via sysctl.
>>>
>>> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
>>> ---
>>
>> [snip]
>>
>>> +	{
>>> +		.procname	= "bpf_jit_current",
>>> +		.data		= &bpf_jit_current,
>>> +		.maxlen		= sizeof(long),
>>> +		.mode		= 0400,
>> Why not 0444?
>
> This mirrors what the other BPF related sysctls do, which only allow
> access from root with CAP_SYS_ADMIN. I'd prefer 0444 as well, but
> Daniel explicitly locked down these sysctls in
> 2e4a30983b0f9b19b59e38bbf7427d7fdd480d98.

Even after this patch, bpf_jit_enable is 0644.
In fact, if you have CAP_BPF or CAP_SYS_ADMIN, this value has no impact on your
programs. But if you don't have one of these capabilities, your program may be
rejected, and you cannot read these values, which would help to understand why.

Regards,
Nicolas
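
The exchange above involves two separate gates: the sysctl file mode (0400 versus 0444) and the `capable(CAP_SYS_ADMIN)` check inside the posted handler. The following is a minimal userspace sketch in C of how the two combine. It is a simplified model, not kernel code: `enum reader`, `mode_allows_read` and `read_permitted` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the process opening the sysctl file. */
enum reader { READER_OWNER, READER_GROUP, READER_OTHER };

/* True if the octal permission bits grant read access to this class. */
bool mode_allows_read(unsigned int mode, enum reader who)
{
	assert((mode & ~0777u) == 0); /* the sketch only models rwx bits */

	switch (who) {
	case READER_OWNER:
		return (mode & 0400u) != 0;
	case READER_GROUP:
		return (mode & 0040u) != 0;
	case READER_OTHER:
		return (mode & 0004u) != 0;
	}
	return false;
}

/*
 * Models the two gates in the posted patch: first the file mode, then
 * the handler's own check, which returns -EPERM without CAP_SYS_ADMIN
 * regardless of the mode.
 */
bool read_permitted(unsigned int mode, enum reader who, bool has_cap_sys_admin)
{
	return mode_allows_read(mode, who) && has_cap_sys_admin;
}
```

Under this model, changing the mode to 0444 alone would not let an unprivileged process read the value, because the handler as posted still requires CAP_SYS_ADMIN.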

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 4150f74c521a..524e7db8d53f 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -123,6 +123,12 @@ compiler in order to reject unprivileged JIT requests once it has
 been surpassed. bpf_jit_limit contains the value of the global limit
 in bytes.
 
+bpf_jit_current
+---------------
+
+The amount of JIT memory currently allocated, in bytes. JITing of
+unprivileged BPF is rejected if this value is above bpf_jit_limit.
+
 dev_weight
 ----------
 
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 8231a6a257f6..42c543a21cd8 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1051,6 +1051,7 @@ extern int bpf_jit_harden;
 extern int bpf_jit_kallsyms;
 extern long bpf_jit_limit;
 extern long bpf_jit_limit_max;
+extern atomic_long_t bpf_jit_current;
 
 typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ab84b3816339..12aedab09222 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -526,6 +526,7 @@ int bpf_jit_kallsyms __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_DEFAULT_ON);
 int bpf_jit_harden __read_mostly;
 long bpf_jit_limit __read_mostly;
 long bpf_jit_limit_max __read_mostly;
+atomic_long_t bpf_jit_current __read_mostly;
 
 static void
 bpf_prog_ksym_set_addr(struct bpf_prog *prog)
@@ -801,8 +802,6 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog,
 	return slot;
 }
 
-static atomic_long_t bpf_jit_current;
-
 /* Can be overridden by an arch's JIT compiler if it has a custom,
  * dedicated BPF backend memory area, or if neither of the two
  * below apply.
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5f88526ad61c..78603f561482 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -15,6 +15,7 @@
 #include <linux/vmalloc.h>
 #include <linux/init.h>
 #include <linux/slab.h>
+#include <linux/atomic.h>
 
 #include <net/ip.h>
 #include <net/sock.h>
@@ -307,6 +308,22 @@ proc_dolongvec_minmax_bpf_restricted(struct ctl_table *table, int write,
 
 	return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
 }
+
+static int proc_bpf_jit_current(struct ctl_table *table, int write,
+				void *buffer, size_t *lenp, loff_t *ppos)
+{
+	long curr = atomic_long_read(&bpf_jit_current) << PAGE_SHIFT;
+	struct ctl_table ctl_entry = {
+		.data = &curr,
+		.maxlen = sizeof(long),
+	};
+
+
+	if (!capable(CAP_SYS_ADMIN) || write)
+		return -EPERM;
+
+	return proc_doulongvec_minmax(&ctl_entry, write, buffer, lenp, ppos);
+}
 #endif
 
 static struct ctl_table net_core_table[] = {
@@ -421,6 +438,13 @@ static struct ctl_table net_core_table[] = {
 		.extra1		= &long_one,
 		.extra2		= &bpf_jit_limit_max,
 	},
+	{
+		.procname	= "bpf_jit_current",
+		.data		= &bpf_jit_current,
+		.maxlen		= sizeof(long),
+		.mode		= 0400,
+		.proc_handler	= proc_bpf_jit_current,
+	},
 #endif
 	{
 		.procname	= "netdev_tstamp_prequeue",
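
The handler in the diff reports `atomic_long_read(&bpf_jit_current) << PAGE_SHIFT`, which suggests the counter is kept in pages and converted to bytes on read. Here is a rough userspace sketch of that arithmetic and of the rule stated in the added documentation ("rejected if this value is above bpf_jit_limit"). `SKETCH_PAGE_SHIFT` (4 KiB pages), `pages_to_bytes` and `unprivileged_jit_rejected` are assumptions made for illustration, and the comparison models the documented wording rather than the kernel's actual charging code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. The real PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Convert a page count to bytes, as the posted handler does with PAGE_SHIFT. */
long pages_to_bytes(long pages)
{
	/* Guard against negative counts and overflow of the shift. */
	assert(pages >= 0 && pages <= (LONG_MAX >> SKETCH_PAGE_SHIFT));
	return pages << SKETCH_PAGE_SHIFT;
}

/* Documented rule: unprivileged JIT is rejected once current exceeds the limit. */
bool unprivileged_jit_rejected(long current_bytes, long limit_bytes)
{
	return current_bytes > limit_bytes;
}
```

With these assumptions, three charged pages are reported as 12288 bytes, which would exceed a hypothetical limit of 8192 bytes.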

Expose bpf_jit_current as a read only value via sysctl.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 Documentation/admin-guide/sysctl/net.rst |  6 ++++++
 include/linux/filter.h                   |  1 +
 kernel/bpf/core.c                        |  3 +--
 net/core/sysctl_net_core.c               | 24 ++++++++++++++++++++++++
 4 files changed, 32 insertions(+), 2 deletions(-)
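
Since the entry is added to net_core_table with procname "bpf_jit_current", it would appear as /proc/sys/net/core/bpf_jit_current on a kernel carrying this patch (note that this series is marked Superseded, so the file may not exist on a given system). A minimal sketch of reading such a single-value sysctl from userspace follows. `parse_sysctl_long` and `read_sysctl_long` are hypothetical helper names, and error handling is deliberately coarse.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Parse one decimal long from an open stream; 0 on success, -EINVAL otherwise. */
int parse_sysctl_long(FILE *f, long *out)
{
	assert(f != NULL && out != NULL);
	/* Any failure to obtain a number, including a read error, maps to -EINVAL. */
	return fscanf(f, "%ld", out) == 1 ? 0 : -EINVAL;
}

/* Open a sysctl file by path and parse its value; negative errno on failure. */
int read_sysctl_long(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ret;

	if (!f)
		return -errno; /* e.g. -ENOENT or -EACCES */

	ret = parse_sysctl_long(f, out);
	fclose(f);
	return ret;
}
```

A caller would pass the path above and treat a negative return as "value unavailable", which covers both a missing file and insufficient privileges.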