Message ID | VI1PR8303MB008003C9E3B937033A593C47FB150@VI1PR8303MB0080.EURPRD83.prod.outlook.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: update verifier to stop perf ring buffer corruption | expand |
On Fri, Oct 30, 2020 at 5:08 AM Kevin Sheldrake <Kevin.Sheldrake@microsoft.com> wrote: > > As discussed, bpf_perf_event_output() takes a u64 for the sample size parameter but the perf ring buffer uses a u16 internally. This results in overlapping samples where the total sample size (including header/padding) exceeds 64K, and prevents samples from being submitted when the total sample size == 64K. > > This patch adds a check to the verifier to force the total sample size to be less than 64K. I'm not convinced it is in the right place stylistically, but it does work. > This is the first patch I've submitted to this list so please forgive me if I'm doing this wrong, and let me know what I should have done. See [0] for some guidelines. I use git format-patch and git send-email for my patch workflow. And please make sure your email client/editor wraps the lines, it's hard to reply if the entire paragraph is one long line. [0] https://kernelnewbies.org/FirstKernelPatch >Also I don't know what the size reduction of -24 relates to (it doesn't match any header struct I've found) but it was found through experimentation. So -24 should have been a clue that something fishy is going on. Look at perf_prepare_sample() in kernel/events/core.c. header->size (which is u16) contains the entire size of the data in the perf event. This includes raw data that you send with bpf_perf_event_output(), but it can also have tons of other stuff (e.g., call stacks, LBR data, etc). What gets added to the perf sample depends on how the perf event was configured in the first place. And it happens automatically on each perf event output. So, all that means that there could be no reliable static check in the verifier which would prevent the corruption. It has to be checked by perf_prepare_sample() in runtime based on the actual size of the sample. We can do an extra check in verifier, but I wouldn't bother because it's never going to be 100% correct. > > Thanks > > Kevin Sheldrake > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index e83ef6f..0941731 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -18,6 +18,13 @@ > */ > #define BPF_MAX_VAR_SIZ (1 << 29) > > +/* Maximum variable size permitted for size param to bpf_perf_event_output(). > + * This ensures the samples sent into the perf ring buffer do not overflow the > + * size parameter in the perf event header. > + */ > +#define BPF_PERF_RAW_SIZ_BITS sizeof(((struct perf_event_header *)0)->size) > +#define BPF_MAX_PERF_SAMP_SIZ ((1 << (BPF_PERF_RAW_SIZ_BITS * 8)) - 24) > + [...]
Andrii Nakryiko wrote: > On Fri, Oct 30, 2020 at 5:08 AM Kevin Sheldrake > <Kevin.Sheldrake@microsoft.com> wrote: > > > > As discussed, bpf_perf_event_output() takes a u64 for the sample size parameter but the perf ring buffer uses a u16 internally. This results in overlapping samples where the total sample size (including header/padding) exceeds 64K, and prevents samples from being submitted when the total sample size == 64K. [...] > >Also I don't know what the size reduction of -24 relates to (it doesn't match any header struct I've found) but it was found through experimentation. > > So -24 should have been a clue that something fishy is going on. Look > at perf_prepare_sample() in kernel/events/core.c. header->size (which > is u16) contains the entire size of the data in the perf event. This > includes raw data that you send with bpf_perf_event_output(), but it > can also have tons of other stuff (e.g., call stacks, LBR data, etc). > What gets added to the perf sample depends on how the perf event was > configured in the first place. And it happens automatically on each > perf event output. > > So, all that means that there could be no reliable static check in the > verifier which would prevent the corruption. It has to be checked by > perf_prepare_sample() in runtime based on the actual size of the > sample. We can do an extra check in verifier, but I wouldn't bother > because it's never going to be 100% correct. Please don't add the check in the verifier if its not 100% correct. I think that will confuse readers and make it appear "safe" when it is not. Even if you add a big warning comment there it will make the error case harder to hit. So lets just solve it as Andrii notes. My $.02 Thanks.
> -----Original Message----- > From: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Sent: 30 October 2020 18:20 > To: Kevin Sheldrake <Kevin.Sheldrake@microsoft.com> > Cc: bpf@vger.kernel.org; KP Singh <kpsingh@google.com> > Subject: [EXTERNAL] Re: [PATCH bpf-next] bpf: update verifier to stop perf > ring buffer corruption > [...] > See [0] for some guidelines. I use git format-patch and git send-email for my > patch workflow. And please make sure your email client/editor wraps the > lines, it's hard to reply if the entire paragraph is one long line. Thank you for the references. Hopefully my newly submitted patch (with a new/appropriate subject line) addresses these issues. [...] > So -24 should have been a clue that something fishy is going on. Look at > perf_prepare_sample() in kernel/events/core.c. header->size (which is u16) > contains the entire size of the data in the perf event. This includes raw data > that you send with bpf_perf_event_output(), but it can also have tons of > other stuff (e.g., call stacks, LBR data, etc). > What gets added to the perf sample depends on how the perf event was > configured in the first place. And it happens automatically on each perf event > output. > > So, all that means that there could be no reliable static check in the verifier > which would prevent the corruption. It has to be checked by > perf_prepare_sample() in runtime based on the actual size of the sample. > We can do an extra check in verifier, but I wouldn't bother because it's never > going to be 100% correct. > Thank you again; I've built a check into perf_prepare_sample(). [...]
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index e83ef6f..0941731 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -18,6 +18,13 @@ */ #define BPF_MAX_VAR_SIZ (1 << 29) +/* Maximum variable size permitted for size param to bpf_perf_event_output(). + * This ensures the samples sent into the perf ring buffer do not overflow the + * size parameter in the perf event header. + */ +#define BPF_PERF_RAW_SIZ_BITS sizeof(((struct perf_event_header *)0)->size) +#define BPF_MAX_PERF_SAMP_SIZ ((1 << (BPF_PERF_RAW_SIZ_BITS * 8)) - 24) + /* Liveness marks, used for registers and spilled-regs (in stack slots). * Read marks propagate upwards until they find a write mark; they record that * "one of this state's descendants read this reg" (and therefore the reg is diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 6200519..42211d4 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4997,7 +4997,7 @@ static int check_reference_leak(struct bpf_verifier_env *env) static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn_idx) { const struct bpf_func_proto *fn = NULL; - struct bpf_reg_state *regs; + struct bpf_reg_state *regs, *reg; struct bpf_call_arg_meta meta; bool changes_data; int i, err; @@ -5054,6 +5054,15 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn return err; } + /* special check for bpf_perf_event_output() size */ + regs = cur_regs(env); + reg = ®s[BPF_REG_5]; + if (func_id == BPF_FUNC_perf_event_output && reg->umax_value >= BPF_MAX_PERF_SAMP_SIZ) { + verbose(env, "bpf_perf_output_event()#%d size parameter must be less than %ld\n", + BPF_FUNC_perf_event_output, BPF_MAX_PERF_SAMP_SIZ); + return -E2BIG; + } + err = record_func_map(env, &meta, func_id, insn_idx); if (err) return err; @@ -5087,8 +5096,6 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn } } - regs = cur_regs(env); - /* check that flags argument in get_local_storage(map, flags) is 0, * this is required because get_local_storage() can't return an error. */