Message ID | 167019256481.3792653.4369637751468386073.stgit@devnote3 (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | BPF |
Headers | show |
Series | [v2] panic: Taint kernel if fault injection has been used | expand |
On Mon, Dec 05, 2022 at 07:22:44AM +0900, Masami Hiramatsu (Google) wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > Since the function error injection framework in the fault injection > subsystem can change the function code flow forcibly, it may cause > unexpected behavior (and that is the purpose of this feature) even > if it is applied to the ALLOW_ERROR_INJECTION functions. > So this feature must be used only for debugging or testing purpose. The whole idea of tainting for kernel debugging is questionable. There are many other *inject* kconfigs and other debug flags for link lists, RCU, sleeping, etc. None of them taint the kernel. > To identify this in the kernel oops message, add a new taint flag Have you ever seen a single oops message because of this particular error injection? > for the fault injection. This taint flag will be set by either > function error injection is used or the BPF use the kprobe_override > on error injectable functions (identified by ALLOW_ERROR_INJECTION). ... > /* set the new array to event->tp_event and set event->prog */ > + if (prog->kprobe_override) > + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); Nack for bpf bits.
On Sun, 4 Dec 2022 14:30:01 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Mon, Dec 05, 2022 at 07:22:44AM +0900, Masami Hiramatsu (Google) wrote: > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > Since the function error injection framework in the fault injection > > subsystem can change the function code flow forcibly, it may cause > > unexpected behavior (and that is the purpose of this feature) even > > if it is applied to the ALLOW_ERROR_INJECTION functions. > > So this feature must be used only for debugging or testing purpose. > > The whole idea of tainting for kernel debugging is questionable. > There are many other *inject* kconfigs and other debug flags > for link lists, RCU, sleeping, etc. > None of them taint the kernel. > > > To identify this in the kernel oops message, add a new taint flag > > Have you ever seen a single oops message because of this particular > error injection? No, but there is no guarantee that the FEI doesn't cause any issue in the future too. If it happens, we need to know the precise information about what FEI/bpf does. FEI is a kind of temporal Livepatch for testing. If Livepatch taints the kernel, why doesn't the FEI taint it too? > > > for the fault injection. This taint flag will be set by either > > function error injection is used or the BPF use the kprobe_override > > on error injectable functions (identified by ALLOW_ERROR_INJECTION). > > ... > > > /* set the new array to event->tp_event and set event->prog */ > > + if (prog->kprobe_override) > > + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); > > Nack for bpf bits. I think this is needed especially for bpf bits. If we see this flag, we can ask reporters to share the bpf programs which they used. Thank you,
On Mon, Dec 05, 2022 at 07:59:21AM +0900, Masami Hiramatsu wrote: > On Sun, 4 Dec 2022 14:30:01 -0800 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > On Mon, Dec 05, 2022 at 07:22:44AM +0900, Masami Hiramatsu (Google) wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > Since the function error injection framework in the fault injection > > > subsystem can change the function code flow forcibly, it may cause > > > unexpected behavior (and that is the purpose of this feature) even > > > if it is applied to the ALLOW_ERROR_INJECTION functions. > > > So this feature must be used only for debugging or testing purpose. > > > > The whole idea of tainting for kernel debugging is questionable. > > There are many other *inject* kconfigs and other debug flags > > for link lists, RCU, sleeping, etc. > > None of them taint the kernel. > > > > > To identify this in the kernel oops message, add a new taint flag > > > > Have you ever seen a single oops message because of this particular > > error injection? > > No, but there is no guarantee that the FEI doesn't cause any issue > in the future too. If it happens, we need to know the precise > information about what FEI/bpf does. > FEI is a kind of temporal Livepatch for testing. If Livepatch taints > the kernel, why doesn't the FEI taint it too? Live patching can replace an arbitrary function and the kernel has no visibility into what KLP module is doing. While 'bpf error injection' is predictable. The functions marked with [BPF_]ALLOW_ERROR_INJECTION can return errors in the normal execution. So the callers of these functions have to deal with errors. If kernel panics on such injected error it potentially would have paniced on it anyway. At this point crash dump might be necessary to debug. Whether oops happened because of bpf, kprobe or normal execution doesn't matter much. The bug is in the caller that wasn't prepared to deal with that error. One can still walk all bpf progs from crash dump with tool "drgn" (it has nice scripts to examine the dumps) or "crash" or other tools. > > > > > for the fault injection. This taint flag will be set by either > > > function error injection is used or the BPF use the kprobe_override > > > on error injectable functions (identified by ALLOW_ERROR_INJECTION). > > > > ... > > > > > /* set the new array to event->tp_event and set event->prog */ > > > + if (prog->kprobe_override) > > > + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); > > > > Nack for bpf bits. > > I think this is needed especially for bpf bits. If we see this flag, > we can ask reporters to share the bpf programs which they used. You can ask reporters to share bpf progs, but you can repro the oops just as well without bpf. It's not bpf to blame, but the bug in the caller that you should worry about.
On Mon, 5 Dec 2022 18:17:00 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Mon, Dec 05, 2022 at 07:59:21AM +0900, Masami Hiramatsu wrote: > > On Sun, 4 Dec 2022 14:30:01 -0800 > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > On Mon, Dec 05, 2022 at 07:22:44AM +0900, Masami Hiramatsu (Google) wrote: > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > Since the function error injection framework in the fault injection > > > > subsystem can change the function code flow forcibly, it may cause > > > > unexpected behavior (and that is the purpose of this feature) even > > > > if it is applied to the ALLOW_ERROR_INJECTION functions. > > > > So this feature must be used only for debugging or testing purpose. > > > > > > The whole idea of tainting for kernel debugging is questionable. > > > There are many other *inject* kconfigs and other debug flags > > > for link lists, RCU, sleeping, etc. > > > None of them taint the kernel. > > > > > > > To identify this in the kernel oops message, add a new taint flag > > > > > > Have you ever seen a single oops message because of this particular > > > error injection? > > > > No, but there is no guarantee that the FEI doesn't cause any issue > > in the future too. If it happens, we need to know the precise > > information about what FEI/bpf does. > > FEI is a kind of temporal Livepatch for testing. If Livepatch taints > > the kernel, why doesn't the FEI taint it too? > > Live patching can replace an arbitrary function and the kernel has > no visibility into what KLP module is doing. > While 'bpf error injection' is predictable. No, not much predictable because the kernel code can be changed. > The functions marked with [BPF_]ALLOW_ERROR_INJECTION can return errors > in the normal execution. So the callers of these functions have to deal with errors. Right, but it might change something before checking the input, and if it rejects the sane input, the caller may go into unexpected status (e.g. the caller already checked input value, and does not expect the call is fail). Such behaviors are buggy, yes. And the FEI is designed for finding such buggy behavior. (e.g. injecting error, but the caller passed successfully, it means the caller code has some issue.) > If kernel panics on such injected error it potentially would have paniced > on it anyway. Yes, but that doesn't cover all cases. If the function doesn't have any internal state but returns an error according to the input, FEI can make it return an error even if the input is correct. And if it cause a kernel panic, that is a panic that must not happen without FEI. Thus, the ALLOW_ERROR_INJECTION should only be applied to the function which has so-called 'side-effect', e.g. memory allocation, external data (except for input data) read, etc. that could cause an error regardless of the input value. Then the caller must handle such errors. > At this point crash dump might be necessary to debug. Yes. So the TAINT flag can help. Please consider that the TAINT flag doesn't mean you are guilty, but this is just a hint for debugging. (good for the first triage) > Whether oops happened because of bpf, kprobe or normal execution > doesn't matter much. The bug is in the caller that wasn't prepared > to deal with that error. > > One can still walk all bpf progs from crash dump with tool "drgn" > (it has nice scripts to examine the dumps) or "crash" or other tools. > > > > > > > > for the fault injection. This taint flag will be set by either > > > > function error injection is used or the BPF use the kprobe_override > > > > on error injectable functions (identified by ALLOW_ERROR_INJECTION). > > > > > > ... > > > > > > > /* set the new array to event->tp_event and set event->prog */ > > > > + if (prog->kprobe_override) > > > > + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); > > > > > > Nack for bpf bits. > > > > I think this is needed especially for bpf bits. If we see this flag, > > we can ask reporters to share the bpf programs which they used. > > You can ask reporters to share bpf progs, but you can repro > the oops just as well without bpf. It's not bpf to blame, but the > bug in the caller that you should worry about. I don't blame the bpf, but just it points that undesigned behavior has been injected. So we have to take it into account. Thank you,
On Tue, Dec 06, 2022 at 04:20:35PM +0900, Masami Hiramatsu wrote: > On Mon, 5 Dec 2022 18:17:00 -0800 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > On Mon, Dec 05, 2022 at 07:59:21AM +0900, Masami Hiramatsu wrote: > > > On Sun, 4 Dec 2022 14:30:01 -0800 > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > > > On Mon, Dec 05, 2022 at 07:22:44AM +0900, Masami Hiramatsu (Google) wrote: > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > > > Since the function error injection framework in the fault injection > > > > > subsystem can change the function code flow forcibly, it may cause > > > > > unexpected behavior (and that is the purpose of this feature) even > > > > > if it is applied to the ALLOW_ERROR_INJECTION functions. > > > > > So this feature must be used only for debugging or testing purpose. > > > > > > > > The whole idea of tainting for kernel debugging is questionable. > > > > There are many other *inject* kconfigs and other debug flags > > > > for link lists, RCU, sleeping, etc. > > > > None of them taint the kernel. > > > > > > > > > To identify this in the kernel oops message, add a new taint flag > > > > > > > > Have you ever seen a single oops message because of this particular > > > > error injection? > > > > > > No, but there is no guarantee that the FEI doesn't cause any issue > > > in the future too. If it happens, we need to know the precise > > > information about what FEI/bpf does. > > > FEI is a kind of temporal Livepatch for testing. If Livepatch taints > > > the kernel, why doesn't the FEI taint it too? > > > > Live patching can replace an arbitrary function and the kernel has > > no visibility into what KLP module is doing. > > While 'bpf error injection' is predictable. > > No, not much predictable because the kernel code can be changed. > > > The functions marked with [BPF_]ALLOW_ERROR_INJECTION can return errors > > in the normal execution. So the callers of these functions have to deal with errors. > > Right, but it might change something before checking the input, and > if it rejects the sane input, the caller may go into unexpected > status (e.g. the caller already checked input value, and does not > expect the call is fail). Such behaviors are buggy, yes. And the > FEI is designed for finding such buggy behavior. > (e.g. injecting error, but the caller passed successfully, it > means the caller code has some issue.) > > > If kernel panics on such injected error it potentially would have paniced > > on it anyway. > > Yes, but that doesn't cover all cases. If the function doesn't have > any internal state but returns an error according to the input, > FEI can make it return an error even if the input is correct. > And if it cause a kernel panic, that is a panic that must not > happen without FEI. > > Thus, the ALLOW_ERROR_INJECTION should only be applied to the > function which has so-called 'side-effect', e.g. memory allocation, > external data (except for input data) read, etc. that could cause > an error regardless of the input value. Then the caller must > handle such errors. Not quite. I think you're confusing functions with 'side effect' with 'pure' functions. Your point about 'checking args before the call' applies to pure functions. We have some too: git grep __pure. Most of the time the compiler can identify the functions that return the same value with the same args, but sometimes it needs help and we mark such functions. Clearly such functions should never be marked as 'error inject' because changing return value of such function might lead to wrong code and _it will not be a kernel bug_. The compiler optimizations rely on the function being pure. Live kernel patching should be very careful with __pure functions too. No idea whether they do this now or not. Of course, there is a category of functions (with or without side effects) which return values should not be changed by error injecton mechanism. That's a given. Applying ALLOW_ERROR_INJECTION should not to be taken lightly. > > > At this point crash dump might be necessary to debug. > > Yes. So the TAINT flag can help. Please consider that the TAINT flag > doesn't mean you are guilty, but this is just a hint for debugging. > (good for the first triage) I think you misunderstand the reason behind 'tainted' flags. It's 'hint for debugging' only on the surface. See Documentation/admin-guide/tainted-kernels.rst "... That's why bug reports from tainted kernels will often be ignored by developers, hence try to reproduce problems with an untainted kernel." When 'error injection' finds a kernel bug the kernel developers need to look into it regardless whether it's syzbot error injection or whatever other mechanism. To change the topic to something ... else... We've just hit this panic using rethook. [ 49.235708] ================================================================== [ 49.236243] BUG: KASAN: use-after-free in rethook_try_get+0x7e/0x380 [ 49.236693] Read of size 8 at addr ffff888102e62c88 by task test_progs/1688 [ 49.240398] kasan_report+0x90/0x190 [ 49.240934] rethook_try_get+0x7e/0x380 [ 49.244885] fprobe_handler.part.1+0x119/0x1f0 [ 49.245505] arch_ftrace_ops_list_func+0x17d/0x1d0 [ 49.246544] ftrace_regs_call+0x5/0x52 [ 49.247411] bpf_fentry_test1+0x5/0x10 [ 49.262578] Allocated by task 1692: [ 49.262804] kasan_save_stack+0x1c/0x40 [ 49.263059] kasan_set_track+0x21/0x30 [ 49.263335] __kasan_kmalloc+0x7a/0x90 [ 49.263624] rethook_alloc+0x2c/0xa0 [ 49.263879] fprobe_init_rethook+0x6d/0x170 [ 49.264154] register_fprobe_ips+0xae/0x130 [ 49.265938] Freed by task 0: [ 49.266153] kasan_save_stack+0x1c/0x40 [ 49.266440] kasan_set_track+0x21/0x30 [ 49.266705] kasan_save_free_info+0x26/0x40 [ 49.266995] __kasan_slab_free+0x103/0x190 [ 49.267282] __kmem_cache_free+0x1b7/0x3a0 [ 49.267559] rcu_core+0x4d8/0xd50 [ 49.268181] Last potentially related work creation: [ 49.268565] kasan_save_stack+0x1c/0x40 [ 49.268898] __kasan_record_aux_stack+0xa1/0xb0 [ 49.269260] call_rcu+0x47/0x360 [ 49.269526] unregister_fprobe+0x47/0x80 [ 49.281382] general protection fault, probably for non-canonical address 0x57e006e00000000: 0000 [#1] PREEMPT SMP KASAN [ 49.282226] CPU: 6 PID: 1688 Comm: test_progs Tainted: G B O 6.1.0-rc7-01508-gf0c5a2d9f234 #4343 [ 49.283751] RIP: 0010:rethook_trampoline_handler+0xff/0x1d0 [ 49.289900] Call Trace: [ 49.290083] <TASK> [ 49.290248] arch_rethook_trampoline_callback+0x6c/0xa0 [ 49.290631] arch_rethook_trampoline+0x2c/0x50 [ 49.290964] ? lock_release+0xad/0x3f0 [ 49.291245] ? bpf_prog_test_run_tracing+0x235/0x380 [ 49.291609] trace_clock_x86_tsc+0x10/0x10 This is just running bpf selftests in parallel mode on 16-cpu VM on bpf-next. Notice 'Tained' flags. Please take a look. Thanks!
On Tue, 6 Dec 2022 20:01:46 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > At this point crash dump might be necessary to debug. > > > > Yes. So the TAINT flag can help. Please consider that the TAINT flag > > doesn't mean you are guilty, but this is just a hint for debugging. > > (good for the first triage) > > I think you misunderstand the reason behind 'tainted' flags. > It's 'hint for debugging' only on the surface. > See Documentation/admin-guide/tainted-kernels.rst > "... That's why bug reports > from tainted kernels will often be ignored by developers, hence try to reproduce > problems with an untainted kernel." You conveniently left out the first part of that paragraph. Showing just a portion of a statement can be very misleading. Let me add the whole paragraph here: The kernel will mark itself as 'tainted' when something occurs that might be relevant later when investigating problems. Don't worry too much about this, most of the time it's not a problem to run a tainted kernel; the information is mainly of interest once someone wants to investigate some problem, as its real cause might be the event that got the kernel tainted. That's why bug reports from tainted kernels will often be ignored by developers, hence try to reproduce problems with an untainted kernel. Let me stress the very first sentence above: The kernel will mark itself as 'tainted' when something occurs that might be relevant later when investigating problems. I think you are the one that is misunderstanding what a taint is. It most definitely is about giving hints for debugging. That's why the very first sentence of that paragraph, as well as the entire document, explicitly states "might be relevant later when investigating problems". > > When 'error injection' finds a kernel bug the kernel developers need to > look into it regardless whether it's syzbot error injection > or whatever other mechanism. > And this is a very useful taint. Just like: 2 _/S 4 kernel running on an out of specification system 5 _/B 32 bad page referenced or some unexpected page flags 7 _/D 128 kernel died recently, i.e. there was an OOPS or BUG 10 _/C 1024 staging driver was loaded 11 _/I 2048 workaround for bug in platform firmware applied 14 _/L 16384 soft lockup occurred 17 _/T 131072 kernel was built with the struct randomization plugin Any of the above should not be ignored by developers, but they are useful hints for debugging the issue. > To change the topic to something ... else... > > We've just hit this panic using rethook. > [ 49.235708] ================================================================== > [ 49.236243] BUG: KASAN: use-after-free in rethook_try_get+0x7e/0x380 > [ 49.236693] Read of size 8 at addr ffff888102e62c88 by task test_progs/1688 > [ 49.240398] kasan_report+0x90/0x190 > [ 49.240934] rethook_try_get+0x7e/0x380 > [ 49.244885] fprobe_handler.part.1+0x119/0x1f0 > [ 49.245505] arch_ftrace_ops_list_func+0x17d/0x1d0 > [ 49.246544] ftrace_regs_call+0x5/0x52 > [ 49.247411] bpf_fentry_test1+0x5/0x10 > > [ 49.262578] Allocated by task 1692: > [ 49.262804] kasan_save_stack+0x1c/0x40 > [ 49.263059] kasan_set_track+0x21/0x30 > [ 49.263335] __kasan_kmalloc+0x7a/0x90 > [ 49.263624] rethook_alloc+0x2c/0xa0 > [ 49.263879] fprobe_init_rethook+0x6d/0x170 > [ 49.264154] register_fprobe_ips+0xae/0x130 > > [ 49.265938] Freed by task 0: > [ 49.266153] kasan_save_stack+0x1c/0x40 > [ 49.266440] kasan_set_track+0x21/0x30 > [ 49.266705] kasan_save_free_info+0x26/0x40 > [ 49.266995] __kasan_slab_free+0x103/0x190 > [ 49.267282] __kmem_cache_free+0x1b7/0x3a0 > [ 49.267559] rcu_core+0x4d8/0xd50 > > [ 49.268181] Last potentially related work creation: > [ 49.268565] kasan_save_stack+0x1c/0x40 > [ 49.268898] __kasan_record_aux_stack+0xa1/0xb0 > [ 49.269260] call_rcu+0x47/0x360 > [ 49.269526] unregister_fprobe+0x47/0x80 > > [ 49.281382] general protection fault, probably for non-canonical address 0x57e006e00000000: 0000 [#1] PREEMPT SMP KASAN > [ 49.282226] CPU: 6 PID: 1688 Comm: test_progs Tainted: G B O 6.1.0-rc7-01508-gf0c5a2d9f234 #4343 > [ 49.283751] RIP: 0010:rethook_trampoline_handler+0xff/0x1d0 > [ 49.289900] Call Trace: > [ 49.290083] <TASK> > [ 49.290248] arch_rethook_trampoline_callback+0x6c/0xa0 > [ 49.290631] arch_rethook_trampoline+0x2c/0x50 > [ 49.290964] ? lock_release+0xad/0x3f0 > [ 49.291245] ? bpf_prog_test_run_tracing+0x235/0x380 > [ 49.291609] trace_clock_x86_tsc+0x10/0x10 > > This is just running bpf selftests in parallel mode on 16-cpu VM on bpf-next. > Notice 'Tained' flags. > Please take a look. > "G - Proprietary module" - "O - out of tree module" Can you reproduce this without those taints? -- Steve
On Tue, 6 Dec 2022 23:39:47 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> "G - Proprietary module" - "O - out of tree module"
Sorry, "P" is proprietary, "G" is still GPL. But it is an out of tree
module still. ;-)
-- Steve
On Tue, Dec 6, 2022 at 8:39 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > [ 49.281382] general protection fault, probably for non-canonical address 0x57e006e00000000: 0000 [#1] PREEMPT SMP KASAN > > [ 49.282226] CPU: 6 PID: 1688 Comm: test_progs Tainted: G B O 6.1.0-rc7-01508-gf0c5a2d9f234 #4343 > > [ 49.283751] RIP: 0010:rethook_trampoline_handler+0xff/0x1d0 > > [ 49.289900] Call Trace: > > [ 49.290083] <TASK> > > [ 49.290248] arch_rethook_trampoline_callback+0x6c/0xa0 > > [ 49.290631] arch_rethook_trampoline+0x2c/0x50 > > [ 49.290964] ? lock_release+0xad/0x3f0 > > [ 49.291245] ? bpf_prog_test_run_tracing+0x235/0x380 > > [ 49.291609] trace_clock_x86_tsc+0x10/0x10 > > > > This is just running bpf selftests in parallel mode on 16-cpu VM on bpf-next. > > Notice 'Tained' flags. > > Please take a look. > > > > "G - Proprietary module" - "O - out of tree module" > > Can you reproduce this without those taints? Lol. That question is exactly the reason why my Nack stands.
On December 6, 2022 11:45:17 PM EST, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: >On Tue, Dec 6, 2022 at 8:39 PM Steven Rostedt <rostedt@goodmis.org> wrote: >> > >> > [ 49.281382] general protection fault, probably for non-canonical address 0x57e006e00000000: 0000 [#1] PREEMPT SMP KASAN >> > [ 49.282226] CPU: 6 PID: 1688 Comm: test_progs Tainted: G B O 6.1.0-rc7-01508-gf0c5a2d9f234 #4343 >> > [ 49.283751] RIP: 0010:rethook_trampoline_handler+0xff/0x1d0 >> > [ 49.289900] Call Trace: >> > [ 49.290083] <TASK> >> > [ 49.290248] arch_rethook_trampoline_callback+0x6c/0xa0 >> > [ 49.290631] arch_rethook_trampoline+0x2c/0x50 >> > [ 49.290964] ? lock_release+0xad/0x3f0 >> > [ 49.291245] ? bpf_prog_test_run_tracing+0x235/0x380 >> > [ 49.291609] trace_clock_x86_tsc+0x10/0x10 >> > >> > This is just running bpf selftests in parallel mode on 16-cpu VM on bpf-next. >> > Notice 'Tained' flags. >> > Please take a look. >> > >> >> "G - Proprietary module" - "O - out of tree module" >> >> Can you reproduce this without those taints? > >Lol. That question is exactly the reason why my Nack stands. I only said the above *because* of your comment ;-) -- Steve
On Tue, 6 Dec 2022 20:45:17 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > "G - Proprietary module" - "O - out of tree module" > > > > Can you reproduce this without those taints? > > Lol. That question is exactly the reason why my Nack stands. First, that's a BS reason for a NACK. But in all seriousness, what I would actually ask (and what I'll ask now) is, what module did you use that is out of tree, and was it relevant to this test? That's a reasonable question to ask, and one that only gets asked with a taint. If there's a BPF injection taint, one can ask that same question, as the bug may happen sometime after the injection but be caused by that injection, and not be in the backtrace. Not all kernel developers have the access to debugging utilities that backend production servers have. A lot of bugs that kernel developers debug are from someone's laptop. Where all they have is that backtrace. If a tool or root kit, added function error injection, it would be extremely useful information to debug what happened. I don't understand the push back here. -- Steve
On Wed, Dec 07, 2022 at 07:48:06AM -0500, Steven Rostedt wrote: > On Tue, 6 Dec 2022 20:45:17 -0800 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > "G - Proprietary module" - "O - out of tree module" > > > > > > Can you reproduce this without those taints? > > > > Lol. That question is exactly the reason why my Nack stands. > > First, that's a BS reason for a NACK. > > But in all seriousness, what I would actually ask (and what I'll ask now) > is, what module did you use that is out of tree, and was it relevant to > this test? > > That's a reasonable question to ask, and one that only gets asked with a > taint. When we receive bug reports in bpf@vger the only question we ask is: "How to reproduce this bug?" We ignoring taint completely. tbh I didn't even know that our BPF CI causes taint until that email. In the previous email I said that the bug report comes from bpf selftest on bpf-next. Clearly there is no ambiguity that this is as upstream as it can get. Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. And that happens all the time on lkml. Somebody reports a bug and kernel devs jump on the poor person: "Can you repro without taint?", "Can you repro with upstream kernel?" This is discouraging. The 'taint' concept makes it easier for kernel devs to ignore bug reports and push back on the reporter. Do it few times and people stop reporting bugs. Say, this particular bug in rethook was found by one of our BPF CI developers. They're not very familiar with the kernel, but they can see plenty of 'rethook' references in the stack trace, lookup MAINTAINER file and ping Massami, but to the question "can you repro without taint?" they can only say NO, because this is how our CI works. So they will keep silence and the bug will be lost. That's not the only reason why I'm against generalizing 'taint'. Tainting because HW is misbehaving makes sense, but tainting because of OoO module or because of live-patching does not. It becomes an excuse that people abuse. Right now syzbot is finding all sorts of bugs. Most of the time syzbot turns error injection on to find those allocation issues. If syzbot reports will start coming as tainted there will be even less attention to them. That will not be good. > If there's a BPF injection taint, one can ask that same question, as the > bug may happen sometime after the injection but be caused by that injection, > and not be in the backtrace. Not all kernel developers have the access to > debugging utilities that backend production servers have. A lot of bugs that > kernel developers debug are from someone's laptop. I would have to disagree. We see few human reported bugs and prioritize them higher, but majority are coming from the production fleet, test tiers, syzbot, CIs, and automated things. > Where all they have is > that backtrace. Even laptops typically leave vmcore after crash. distro support packages have clear rules what scripts to run to collect all the necessary data in case of the crash from vmcore. These tools support extracting everything bpf related too. For example see: Documentation/bpf/drgn.rst It works on vmcore and on the running kernel. > If a tool or root kit, added function error injection, it > would be extremely useful information to debug what happened. > > I don't understand the push back here. All these years we've been working on improving bpf introspection and debuggability. Today crash dumps look like this: bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] __bpf_prog_run include/linux/filter.h:600 [inline] bpf_prog_run include/linux/filter.h:607 [inline] The 2nd from the top is a bpf prog. The rest are kernel functions. bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk ^^ is a prog tag ^^ name of bpf prog If you do 'bpftool prog show' you can see both tag and name. 'bpftool prog dump jited' dumps x86 code mixed with source line text. Often enough +0x2b offset will have some C code right next to it. One can monitor all prog load/unload via perf or via audit. 'perf record' collects all bpf progs that were loaded before the start and even thouse that were loaded and unloaded quickly while 'perf record' was running. So 'perf report' has all the data including annotation and source code. Currently we're working on adding 'file.c:lineno' to dump_stack() for bpf progs. If you have ideas how we can improve debuggability we are all ears.
On Wed, 7 Dec 2022 20:36:28 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > But in all seriousness, what I would actually ask (and what I'll ask now) > > is, what module did you use that is out of tree, and was it relevant to > > this test? > > > > That's a reasonable question to ask, and one that only gets asked with a > > taint. > > When we receive bug reports in bpf@vger the only question we ask is: > "How to reproduce this bug?" > We ignoring taint completely. Um, I'm guessing that bug reports that go to bpf@vger are focused on issues with BPF and not core kernel infrastructure. No, I do not consider BPF (nor tracing for that matter) as core infrastructure. The bug reports I'm talking about is for the scheduler, exception handling, interrupts, VFS, MM, and networking. When one of these have a bug report, you most definitely do look at taints. There's been several times where a bug report was caused by an out of tree module, or a machine check exception. > tbh I didn't even know that our BPF CI causes taint until that email. That's because BPF development probably never gets hit by things that cause taints. That doesn't go for the rest of the kernel. > > In the previous email I said that the bug report comes from bpf selftest on bpf-next. > Clearly there is no ambiguity that this is as upstream as it can get. > Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. Who isn't looking at the bug because of the 'taint' argument? I'm not looking at it right now, because it isn't top of my priority list. > And that happens all the time on lkml. Somebody reports a bug and kernel devs > jump on the poor person: > "Can you repro without taint?", > "Can you repro with upstream kernel?" > This is discouraging. So is spending 5 days debugging something and then finding out that what caused the taint *was* the culprit! Sorry, but that has happened to me too many times. Which is why you get grumpy kernel developers pushing back on this. > The 'taint' concept makes it easier for kernel devs to ignore bug reports > and push back on the reporter. Do it few times and people stop reporting bugs. > Say, this particular bug in rethook was found by one of our BPF CI developers. > They're not very familiar with the kernel, but they can see plenty of 'rethook' > references in the stack trace, lookup MAINTAINER file and ping Massami, > but to the question "can you repro without taint?" they can only say NO, > because this is how our CI works. So they will keep silence and the bug will be lost. > That's not the only reason why I'm against generalizing 'taint'. > Tainting because HW is misbehaving makes sense, but tainting because > of OoO module or because of live-patching does not. > It becomes an excuse that people abuse. Again, I've been burned by OoO modules more than once. If you send 5 days debugging something to find out that the cause was from an OoO module, you'd change your tune too. > > Right now syzbot is finding all sorts of bugs. Most of the time syzbot > turns error injection on to find those allocation issues. > If syzbot reports will start coming as tainted there will be even less > attention to them. That will not be good. > > > If there's a BPF injection taint, one can ask that same question, as the > > bug may happen sometime after the injection but be caused by that injection, > > and not be in the backtrace. Not all kernel developers have the access to > > debugging utilities that backend production servers have. A lot of bugs that > > kernel developers debug are from someone's laptop. > > I would have to disagree. > We see few human reported bugs and prioritize them higher, > but majority are coming from the production fleet, test tiers, > syzbot, CIs, and automated things. > > > Where all they have is > > that backtrace. > > Even laptops typically leave vmcore after crash. distro support packages have > clear rules what scripts to run to collect all the necessary data in case > of the crash from vmcore. They are not as common as you think. > These tools support extracting everything bpf related too. > For example see: > Documentation/bpf/drgn.rst > It works on vmcore and on the running kernel. > > > If a tool or root kit, added function error injection, it > > would be extremely useful information to debug what happened. > > > > I don't understand the push back here. > > All these years we've been working on improving bpf introspection and > debuggability. Today crash dumps look like this: > bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 > bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] > __bpf_prog_run include/linux/filter.h:600 [inline] > bpf_prog_run include/linux/filter.h:607 [inline] > > The 2nd from the top is a bpf prog. The rest are kernel functions. > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk > ^^ is a prog tag ^^ name of bpf prog > > If you do 'bpftool prog show' you can see both tag and name. > 'bpftool prog dump jited' > dumps x86 code mixed with source line text. > Often enough +0x2b offset will have some C code right next to it. > > One can monitor all prog load/unload via perf or via audit. > 'perf record' collects all bpf progs that were loaded before > the start and even thouse that were loaded and unloaded quickly > while 'perf record' was running. > So 'perf report' has all the data including annotation and source code. > > Currently we're working on adding 'file.c:lineno' to dump_stack() > for bpf progs. > > If you have ideas how we can improve debuggability we are all ears. I talked with KP on IRC, and he said he's going to work on the same thing as above. That is, showing what BPF programs are loaded, and if they modify the return of any function, as well as what those functions are. And have that reported in the oops message. I'm not hell bent on BPF triggering a taint, as long as the oops message explicitly defines what BPF is doing. I think that's much more informative than a taint, and I would greatly welcome that. The main issue I'm raising is that I want the oops message to have most the information to gather what went wrong, and not rely on other tooling like kernel core dumps and drgn for debugging the problem. -- Steve
Hi Alexei, On Wed, 7 Dec 2022 20:36:28 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. > And that happens all the time on lkml. Somebody reports a bug and kernel devs > jump on the poor person: > "Can you repro without taint?", > "Can you repro with upstream kernel?" > This is discouraging. > The 'taint' concept makes it easier for kernel devs to ignore bug reports > and push back on the reporter. > Do it few times and people stop reporting bugs. That seems off topic for me. You seems complained against the taint flag itself. > Say, this particular bug in rethook was found by one of our BPF CI developers. > They're not very familiar with the kernel, but they can see plenty of 'rethook' > references in the stack trace, lookup MAINTAINER file and ping Massami, > but to the question "can you repro without taint?" they can only say NO, > because this is how our CI works. So they will keep silence and the bug will be lost. BTW, this sounds like the BPF CI system design issue. If user is NOT easily identifying what test caused the issue (e.g. what tests ran on the system until the bug was found), the CI system is totally useless, because after finding a problem, it must be investigated to solve the problem. Without investigation, how would you usually fix the bug?? > That's not the only reason why I'm against generalizing 'taint'. > Tainting because HW is misbehaving makes sense, but tainting because > of OoO module or because of live-patching does not. > It becomes an excuse that people abuse. yeah, it is possible to be abused. but that is the problem who abuse it. > Right now syzbot is finding all sorts of bugs. Most of the time syzbot > turns error injection on to find those allocation issues. > If syzbot reports will start coming as tainted there will be even less > attention to them. That will not be good. Hmm, what kind of error injection does syzbot do? I would like to know how it is used. For example, does that use only a specify set of injection points, or use all existing points? If the latter, I feel safer because syzbot ensures the current all ALLOW_ERROR_INJECTION() functions will work with error injection. If not, we need to consider removing the ALLOW_ERROR_INJECTION() from the function which is not tested well (or add this taint flag.) Documentation/fault-injection/fault-injection.rst has no explanation about ALLOW_ERROR_INJECTION(), but obviously the ALLOW_ERROR_INJECTION() marked functions and its caller MUST be designed safely against the error injection. e.g. - It must return an error code. (so EI_ETYPE_NONE must be removed) - Caller must check the return value always. (but I thought this was the reason why we need this test framework...) - It should not run any 'effective' code before checking an error. For example, increment counter, call other functions etc. (this means it can return without any side-effect) Anything else? [...] > All these years we've been working on improving bpf introspection and > debuggability. Today crash dumps look like this: > bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 > bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] > __bpf_prog_run include/linux/filter.h:600 [inline] > bpf_prog_run include/linux/filter.h:607 [inline] > > The 2nd from the top is a bpf prog. The rest are kernel functions. > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk > ^^ is a prog tag ^^ name of bpf prog > > If you do 'bpftool prog show' you can see both tag and name. > 'bpftool prog dump jited' > dumps x86 code mixed with source line text. > Often enough +0x2b offset will have some C code right next to it. This is good, but this only works when the vmcore is dumped and on the stack. My concern about the function error injection is that makes some side effects, which can cause a problem afterwards (this means after unloading the bpf prog) > > One can monitor all prog load/unload via perf or via audit. Ah, audit is helpful :), because we can dig the log what was loaded before crash. Thank you,
On Sun, Dec 11, 2022 at 3:52 AM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > Hi Alexei, > > On Wed, 7 Dec 2022 20:36:28 -0800 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. > > And that happens all the time on lkml. Somebody reports a bug and kernel devs > > jump on the poor person: > > "Can you repro without taint?", > > "Can you repro with upstream kernel?" > > This is discouraging. > > The 'taint' concept makes it easier for kernel devs to ignore bug reports > > and push back on the reporter. > > Do it few times and people stop reporting bugs. > > That seems off topic for me. You seems complained against the taint flag > itself. The series is about adding a taint for, so discussing the user experience, when someone reports a "tainted crash" seems reasonable to me and not off topic. > > > Say, this particular bug in rethook was found by one of our BPF CI developers. > > They're not very familiar with the kernel, but they can see plenty of 'rethook' > > references in the stack trace, lookup MAINTAINER file and ping Massami, > > but to the question "can you repro without taint?" they can only say NO, > > because this is how our CI works. So they will keep silence and the bug will be lost. > > BTW, this sounds like the BPF CI system design issue. If user is NOT easily > identifying what test caused the issue (e.g. what tests ran on the system > until the bug was found), the CI system is totally useless, because after > finding a problem, it must be investigated to solve the problem. > > Without investigation, how would you usually fix the bug?? Masami, this seems accusational and counter productive, it was never said that issues can be solved without investigation. The BPF CI does find issues, the BPF reviewers and maintainers regularly fix bugs using it. Alexei's point here is that a taint does not help in solving the problem, rather deter some people from even looking at it. (not BPF people, but other maintainers [distro, kernel] who would ask for a reproduction without a taint). Let's take a step back and focus on solving debuggability and introspection as we clearly have some perception issues about taints in the community. (distro maintainers, users) before we go and add more taints. > > > That's not the only reason why I'm against generalizing 'taint'. > > Tainting because HW is misbehaving makes sense, but tainting because > > of OoO module or because of live-patching does not. > > It becomes an excuse that people abuse. > > yeah, it is possible to be abused. but that is the problem who > abuse it. I am sorry, but it's our responsibility as developers to design features so that users don't face arduous pushbacks. > > Right now syzbot is finding all sorts of bugs. Most of the time syzbot > > turns error injection on to find those allocation issues. > > If syzbot reports will start coming as tainted there will be even less > > attention to them. That will not be good. > > Hmm, what kind of error injection does syzbot do? I would like to know > how it is used. For example, does that use only a specify set of > injection points, or use all existing points? > > If the latter, I feel safer because syzbot ensures the current all > ALLOW_ERROR_INJECTION() functions will work with error injection. If not, > we need to consider removing the ALLOW_ERROR_INJECTION() from the > function which is not tested well (or add this taint flag.) > > Documentation/fault-injection/fault-injection.rst has no explanation > about ALLOW_ERROR_INJECTION(), but obviously the ALLOW_ERROR_INJECTION() > marked functions and its caller MUST be designed safely against the > error injection. e.g. > > - It must return an error code. (so EI_ETYPE_NONE must be removed) This is already the case with BPF, the modify return trampolines further limits the error injection to functions that return errors. > - Caller must check the return value always. > (but I thought this was the reason why we need this test framework...) > - It should not run any 'effective' code before checking an error. > For example, increment counter, call other functions etc. > (this means it can return without any side-effect) This is the case with modify_return trampolines in BPF which avoid side effects and limit the attachment surface further and avoiding side effects is a design goal. If we missed anything, let's fix that. https://lwn.net/Articles/813724/https://lwn.net/Articles/813724/ > > Anything else? > > [...] > > All these years we've been working on improving bpf introspection and > > debuggability. Today crash dumps look like this: > > bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 > > bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] > > __bpf_prog_run include/linux/filter.h:600 [inline] > > bpf_prog_run include/linux/filter.h:607 [inline] > > > > The 2nd from the top is a bpf prog. The rest are kernel functions. > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk > > ^^ is a prog tag ^^ name of bpf prog > > > > If you do 'bpftool prog show' you can see both tag and name. > > 'bpftool prog dump jited' > > dumps x86 code mixed with source line text. > > Often enough +0x2b offset will have some C code right next to it. > > This is good, but this only works when the vmcore is dumped and > on the stack. My concern about the function error injection is > that makes some side effects, which can cause a problem afterwards > (this means after unloading the bpf prog) I think careful choices need to be made on when error injection is allowed so that these situations don't occur. (as you mentioned in your comment). [1]. If a BPF program is unloaded, there is no error injection any more, let's ensure that we design the error injection allow list and the BPF logic to ensure this cannot happen. > > > > > One can monitor all prog load/unload via perf or via audit. I would like us to focus on debuggability as it helps both the maintainers and the user. And I see a few things that need to be done: 1. Revisit what is allowed for error injection in the kernel and if they can cause any subtle issues. My initial take is that functions that are directly called from syscall path should generally be okay. But let's check them for the patterns you mentioned. 2. If it helps, add the list of BPF modify return programs to stack traces. Although this is really needed if we don't do [1] properly. 3. Check if anything needs to be improved in the verification logic for modify return trampolines. - KP > > Ah, audit is helpful :), because we can dig the log what was loaded > before crash. > > > Thank you, > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sun, 11 Dec 2022 08:49:01 +0100 KP Singh <kpsingh@kernel.org> wrote: > On Sun, Dec 11, 2022 at 3:52 AM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > Hi Alexei, > > > > On Wed, 7 Dec 2022 20:36:28 -0800 > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. > > > And that happens all the time on lkml. Somebody reports a bug and kernel devs > > > jump on the poor person: > > > "Can you repro without taint?", > > > "Can you repro with upstream kernel?" > > > This is discouraging. > > > The 'taint' concept makes it easier for kernel devs to ignore bug reports > > > and push back on the reporter. > > > Do it few times and people stop reporting bugs. > > > > That seems off topic for me. You seems complained against the taint flag > > itself. > > The series is about adding a taint for, so discussing the user > experience, when someone reports a "tainted crash" seems reasonable to > me and not off topic. > > > > > > Say, this particular bug in rethook was found by one of our BPF CI developers. > > > They're not very familiar with the kernel, but they can see plenty of 'rethook' > > > references in the stack trace, lookup MAINTAINER file and ping Massami, > > > but to the question "can you repro without taint?" they can only say NO, > > > because this is how our CI works. So they will keep silence and the bug will be lost. > > > > BTW, this sounds like the BPF CI system design issue. If user is NOT easily > > identifying what test caused the issue (e.g. what tests ran on the system > > until the bug was found), the CI system is totally useless, because after > > finding a problem, it must be investigated to solve the problem. > > > > Without investigation, how would you usually fix the bug?? > > Masami, this seems accusational and counter productive, it was never > said that issues can be solved without investigation. Let me apologies about my misunderstanding. > > The BPF CI does find issues, the BPF reviewers and maintainers > regularly fix bugs using it. Alexei's point here is that a taint does > not help in solving the problem, rather deter some people from even > looking at it. (not BPF people, but other maintainers [distro, kernel] > who would ask for a reproduction without a taint). Hmm, that is a problem. Some taint flag should be useful hints for finding the error patterns. > Let's take a step back and focus on solving debuggability and > introspection as we clearly have some perception issues about taints > in the community. (distro maintainers, users) before we go and add > more taints. Agreed. > > > That's not the only reason why I'm against generalizing 'taint'. > > > Tainting because HW is misbehaving makes sense, but tainting because > > > of OoO module or because of live-patching does not. > > > It becomes an excuse that people abuse. > > > > yeah, it is possible to be abused. but that is the problem who > > abuse it. > > I am sorry, but it's our responsibility as developers to design > features so that users don't face arduous pushbacks. Sorry if I confuse you. I meant that taint flag abusing. :( > > > > Right now syzbot is finding all sorts of bugs. Most of the time syzbot > > > turns error injection on to find those allocation issues. > > > If syzbot reports will start coming as tainted there will be even less > > > attention to them. That will not be good. > > > > Hmm, what kind of error injection does syzbot do? I would like to know > > how it is used. For example, does that use only a specify set of > > injection points, or use all existing points? > > > > If the latter, I feel safer because syzbot ensures the current all > > ALLOW_ERROR_INJECTION() functions will work with error injection. If not, > > we need to consider removing the ALLOW_ERROR_INJECTION() from the > > function which is not tested well (or add this taint flag.) > > > > Documentation/fault-injection/fault-injection.rst has no explanation > > about ALLOW_ERROR_INJECTION(), but obviously the ALLOW_ERROR_INJECTION() > > marked functions and its caller MUST be designed safely against the > > error injection. e.g. > > > > - It must return an error code. (so EI_ETYPE_NONE must be removed) > > This is already the case with BPF, the modify return trampolines > further limits the error injection to functions that return errors. OK, so I also should remove it from FEI. > > > - Caller must check the return value always. > > (but I thought this was the reason why we need this test framework...) > > - It should not run any 'effective' code before checking an error. > > For example, increment counter, call other functions etc. > > (this means it can return without any side-effect) > > This is the case with modify_return trampolines in BPF which avoid > side effects and limit the attachment surface further and avoiding > side effects is a design goal. If we missed anything, let's fix that. > > https://lwn.net/Articles/813724/ Yeah, if BPF tests already tested all ALLOW_ERROR_INJECTION() functions, it may be checked already. I think we just need adding the above explanation on the document, so that the people who will add additional ALLOW_ERROR_INJECTION() on a function, can understand the limitation. > > > > > Anything else? > > > > [...] > > > All these years we've been working on improving bpf introspection and > > > debuggability. Today crash dumps look like this: > > > bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 > > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 > > > bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] > > > __bpf_prog_run include/linux/filter.h:600 [inline] > > > bpf_prog_run include/linux/filter.h:607 [inline] > > > > > > The 2nd from the top is a bpf prog. The rest are kernel functions. > > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk > > > ^^ is a prog tag ^^ name of bpf prog > > > > > > If you do 'bpftool prog show' you can see both tag and name. > > > 'bpftool prog dump jited' > > > dumps x86 code mixed with source line text. > > > Often enough +0x2b offset will have some C code right next to it. > > > > This is good, but this only works when the vmcore is dumped and > > on the stack. My concern about the function error injection is > > that makes some side effects, which can cause a problem afterwards > > (this means after unloading the bpf prog) > > I think careful choices need to be made on when error injection is > allowed so that these situations don't occur. (as you mentioned in > your comment). [1]. If a BPF program is unloaded, there is no error > injection any more, let's ensure that we design the error injection > allow list and the BPF logic to ensure this cannot happen. OK. Actually, I trust the BPF logic itself will be handle this correctly. I just concerned that some people who don't know much (because it is not carefully documented) might add ALLOW_ERROR_INJECTION() to a function which is not injectable by design. Thus I thought the taint flag can help. But if those are always Cc'd to bpf@vger and it will be tested by BPF CI, I'm OK for that. > > > One can monitor all prog load/unload via perf or via audit. > > I would like us to focus on debuggability as it helps both the > maintainers and the user. And I see a few things that need to be done: > > 1. Revisit what is allowed for error injection in the kernel and if > they can cause any subtle issues. My initial take is that functions > that are directly called from syscall path should generally be okay. > But let's check them for the patterns you mentioned. Yeah, I agree that syscall entries should be safe. > 2. If it helps, add the list of BPF modify return programs to stack > traces. Although this is really needed if we don't do [1] properly. Would you mean a list of enabled BPF programs in Oops code? If so, I also want to add enabled FEI list on it. > 3. Check if anything needs to be improved in the verification logic > for modify return trampolines. I think BPF logic itself is safe. But the targeted function itself or the caller may not be designed for such error injection. I think this is my fault that I have not documented about ALLOW_ERROR_INJECTION() well. Sorry about that. Thank you,
On Sun, 11 Dec 2022 08:49:01 +0100 KP Singh <kpsingh@kernel.org> wrote: > Let's take a step back and focus on solving debuggability and > introspection as we clearly have some perception issues about taints > in the community. (distro maintainers, users) before we go and add > more taints. Note, you will likely get the same push back if the dump includes bpf programs known to change the return of a function that may be involved with the bug report. That is, if a crash is reported to code I maintain, and I see that the bug report includes a list of BPF programs that can modify the return of a function, and one of those functions could affect the place that crashed, I'd push back and ask if the crash could be done without that BPF program loaded, regardless of taints. I agree that a taint is just a hint and it can include something that caused the bug or it may not. I would like to see more details in how the crashed kernel was configured. That includes loaded BPF programs (just like we include loaded modules). And if any BPF program modifies a core function (outside of syscall returns) I'd be a bit suspect of what happened. I also agree that if a function that checks error paths fails, it should be fixed, but knowing that the error path was caused by fault injection will prevent the wasted effort that most developers will go through to find out why the error path was hit in the first place. -- Steve
Hi, On Sun, 11 Dec 2022 08:49:01 +0100 KP Singh <kpsingh@kernel.org> wrote: > 1. Revisit what is allowed for error injection in the kernel and if > they can cause any subtle issues. My initial take is that functions > that are directly called from syscall path should generally be okay. > But let's check them for the patterns you mentioned. > 2. If it helps, add the list of BPF modify return programs to stack > traces. Although this is really needed if we don't do [1] properly. > 3. Check if anything needs to be improved in the verification logic > for modify return trampolines. Hmm, I found that bpf might not check the acceptable error type of each ALLOW_ERROR_INJECTION(). Except for EI_ETYPE_NONE, we have 4 types of the error. EI_ETYPE_NULL, /* Return NULL if failure */ EI_ETYPE_ERRNO, /* Return -ERRNO if failure */ EI_ETYPE_ERRNO_NULL, /* Return -ERRNO or NULL if failure */ EI_ETYPE_TRUE, /* Return true if failure */ These specifies that what return value will be treated as an error by the caller. If bpf trampoline only expect that the function will return -errno in error cases, bpf should check the error type as below. etype = get_injectable_error_type(addr); if (etype != EI_ETYPE_ERRNO && etype != EI_ETYPE_ERRNO_NULL) /* reject it */ If bpf can handle any case, it still need to verify that the user bpf prog specifies correct return value for each type. See adjust_error_retval()@kernel/fail_function.c for the available return values. Thank you,
On Sun, Dec 11, 2022 at 4:14 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Sun, 11 Dec 2022 08:49:01 +0100 > KP Singh <kpsingh@kernel.org> wrote: > > > On Sun, Dec 11, 2022 at 3:52 AM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > Hi Alexei, > > > > > > On Wed, 7 Dec 2022 20:36:28 -0800 > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > > > Yet for 2 days this 'taint' arguing is preventing people from looking at the bug. > > > > And that happens all the time on lkml. Somebody reports a bug and kernel devs > > > > jump on the poor person: > > > > "Can you repro without taint?", > > > > "Can you repro with upstream kernel?" > > > > This is discouraging. > > > > The 'taint' concept makes it easier for kernel devs to ignore bug reports > > > > and push back on the reporter. > > > > Do it few times and people stop reporting bugs. > > > > > > That seems off topic for me. You seems complained against the taint flag > > > itself. > > > > The series is about adding a taint for, so discussing the user > > experience, when someone reports a "tainted crash" seems reasonable to > > me and not off topic. > > > > > > > > > Say, this particular bug in rethook was found by one of our BPF CI developers. > > > > They're not very familiar with the kernel, but they can see plenty of 'rethook' > > > > references in the stack trace, lookup MAINTAINER file and ping Massami, > > > > but to the question "can you repro without taint?" they can only say NO, > > > > because this is how our CI works. So they will keep silence and the bug will be lost. > > > > > > BTW, this sounds like the BPF CI system design issue. If user is NOT easily > > > identifying what test caused the issue (e.g. what tests ran on the system > > > until the bug was found), the CI system is totally useless, because after > > > finding a problem, it must be investigated to solve the problem. > > > > > > Without investigation, how would you usually fix the bug?? > > > > Masami, this seems accusational and counter productive, it was never > > said that issues can be solved without investigation. > > Let me apologies about my misunderstanding. e-mail is hard, I am glad this is already progressing constructively and we are making things better. > > > > > The BPF CI does find issues, the BPF reviewers and maintainers > > regularly fix bugs using it. Alexei's point here is that a taint does > > not help in solving the problem, rather deter some people from even > > looking at it. (not BPF people, but other maintainers [distro, kernel] > > who would ask for a reproduction without a taint). > > Hmm, that is a problem. Some taint flag should be useful hints > for finding the error patterns. the boolean information contained in taints is not helpful, stuff like audit logs / crash dumps / bpftool reports are much more helpful. > > > Let's take a step back and focus on solving debuggability and > > introspection as we clearly have some perception issues about taints > > in the community. (distro maintainers, users) before we go and add > > more taints. > > Agreed. > > > > > That's not the only reason why I'm against generalizing 'taint'. > > > > Tainting because HW is misbehaving makes sense, but tainting because > > > > of OoO module or because of live-patching does not. > > > > It becomes an excuse that people abuse. > > > > > > yeah, it is possible to be abused. but that is the problem who > > > abuse it. > > > > I am sorry, but it's our responsibility as developers to design > > features so that users don't face arduous pushbacks. > > Sorry if I confuse you. I meant that taint flag abusing. :( Again no worries, but we need to make sure that the taint flag is not abused, and if it is being abused, limit the damage somewhere. > > > > > > > Right now syzbot is finding all sorts of bugs. Most of the time syzbot > > > > turns error injection on to find those allocation issues. > > > > If syzbot reports will start coming as tainted there will be even less > > > > attention to them. That will not be good. > > > > > > Hmm, what kind of error injection does syzbot do? I would like to know > > > how it is used. For example, does that use only a specify set of > > > injection points, or use all existing points? > > > > > > If the latter, I feel safer because syzbot ensures the current all > > > ALLOW_ERROR_INJECTION() functions will work with error injection. If not, > > > we need to consider removing the ALLOW_ERROR_INJECTION() from the > > > function which is not tested well (or add this taint flag.) > > > > > > Documentation/fault-injection/fault-injection.rst has no explanation > > > about ALLOW_ERROR_INJECTION(), but obviously the ALLOW_ERROR_INJECTION() > > > marked functions and its caller MUST be designed safely against the > > > error injection. e.g. > > > > > > - It must return an error code. (so EI_ETYPE_NONE must be removed) > > > > This is already the case with BPF, the modify return trampolines > > further limits the error injection to functions that return errors. > > OK, so I also should remove it from FEI. > > > > > > - Caller must check the return value always. > > > (but I thought this was the reason why we need this test framework...) > > > - It should not run any 'effective' code before checking an error. > > > For example, increment counter, call other functions etc. > > > (this means it can return without any side-effect) > > > > This is the case with modify_return trampolines in BPF which avoid > > side effects and limit the attachment surface further and avoiding > > side effects is a design goal. If we missed anything, let's fix that. > > > > https://lwn.net/Articles/813724/ > > Yeah, if BPF tests already tested all ALLOW_ERROR_INJECTION() functions, > it may be checked already. I think we just need adding the above > explanation on the document, so that the people who will add additional > ALLOW_ERROR_INJECTION() on a function, can understand the limitation. > > > > > > > > > Anything else? > > > > > > [...] > > > > All these years we've been working on improving bpf introspection and > > > > debuggability. Today crash dumps look like this: > > > > bpf_trace_printk+0xd3/0x170 kernel/trace/bpf_trace.c:377 > > > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk+0x2b/0x37 > > > > bpf_dispatcher_nop_func include/linux/bpf.h:1082 [inline] > > > > __bpf_prog_run include/linux/filter.h:600 [inline] > > > > bpf_prog_run include/linux/filter.h:607 [inline] > > > > > > > > The 2nd from the top is a bpf prog. The rest are kernel functions. > > > > bpf_prog_cf2ac6d483d8499b_trace_bpf_trace_printk > > > > ^^ is a prog tag ^^ name of bpf prog > > > > > > > > If you do 'bpftool prog show' you can see both tag and name. > > > > 'bpftool prog dump jited' > > > > dumps x86 code mixed with source line text. > > > > Often enough +0x2b offset will have some C code right next to it. > > > > > > This is good, but this only works when the vmcore is dumped and > > > on the stack. My concern about the function error injection is > > > that makes some side effects, which can cause a problem afterwards > > > (this means after unloading the bpf prog) > > > > I think careful choices need to be made on when error injection is > > allowed so that these situations don't occur. (as you mentioned in > > your comment). [1]. If a BPF program is unloaded, there is no error > > injection any more, let's ensure that we design the error injection > > allow list and the BPF logic to ensure this cannot happen. > > OK. Actually, I trust the BPF logic itself will be handle this > correctly. I just concerned that some people who don't know much > (because it is not carefully documented) might add ALLOW_ERROR_INJECTION() > to a function which is not injectable by design. Thus I thought the > taint flag can help. We should definitely carefully review any new ALLOW_ERROR_INJECTION functions, that's the real value add. > But if those are always Cc'd to bpf@vger and it will be tested by BPF > CI, I'm OK for that. > > > > > One can monitor all prog load/unload via perf or via audit. > > > > I would like us to focus on debuggability as it helps both the > > maintainers and the user. And I see a few things that need to be done: > > > > 1. Revisit what is allowed for error injection in the kernel and if > > they can cause any subtle issues. My initial take is that functions > > that are directly called from syscall path should generally be okay. > > But let's check them for the patterns you mentioned. > > Yeah, I agree that syscall entries should be safe. > > > 2. If it helps, add the list of BPF modify return programs to stack > > traces. Although this is really needed if we don't do [1] properly. > > Would you mean a list of enabled BPF programs in Oops code? If so, > I also want to add enabled FEI list on it. > > > 3. Check if anything needs to be improved in the verification logic > > for modify return trampolines. > > I think BPF logic itself is safe. But the targeted function itself > or the caller may not be designed for such error injection. > I think this is my fault that I have not documented about > ALLOW_ERROR_INJECTION() well. Sorry about that. We aim to be blameless and constructive :) Thanks again for sending the patches! Cheers, - KP > > Thank you, > > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sun, Dec 11, 2022 at 6:02 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Sun, 11 Dec 2022 08:49:01 +0100 > KP Singh <kpsingh@kernel.org> wrote: > > > Let's take a step back and focus on solving debuggability and > > introspection as we clearly have some perception issues about taints > > in the community. (distro maintainers, users) before we go and add > > more taints. > > Note, you will likely get the same push back if the dump includes bpf > programs known to change the return of a function that may be involved > with the bug report. That is, if a crash is reported to code I > maintain, and I see that the bug report includes a list of BPF programs > that can modify the return of a function, and one of those functions > could affect the place that crashed, I'd push back and ask if the crash > could be done without that BPF program loaded, regardless of taints. > I think this is already better as it gives the recipient of the stack trace more information to ask more questions and see if the BPF programs are relevant to the stack trace and engage further. > I agree that a taint is just a hint and it can include something that > caused the bug or it may not. I would like to see more details in how > the crashed kernel was configured. That includes loaded BPF programs > (just like we include loaded modules). And if any BPF program modifies > a core function (outside of syscall returns) I'd be a bit suspect of > what happened. Agreed. > > I also agree that if a function that checks error paths fails, it > should be fixed, but knowing that the error path was caused by fault > injection will prevent the wasted effort that most developers will go > through to find out why the error path was hit in the first place. Agreed. > > -- Steve
diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst index 92a8a07f5c43..63d7cd4f6250 100644 --- a/Documentation/admin-guide/tainted-kernels.rst +++ b/Documentation/admin-guide/tainted-kernels.rst @@ -101,6 +101,7 @@ Bit Log Number Reason that got the kernel tainted 16 _/X 65536 auxiliary taint, defined for and used by distros 17 _/T 131072 kernel was built with the struct randomization plugin 18 _/N 262144 an in-kernel test has been run + 19 _/J 524288 a function-level error has been injected === === ====== ======================================================== Note: The character ``_`` is representing a blank in this table to make reading @@ -182,3 +183,7 @@ More detailed explanation for tainting produce extremely unusual kernel structure layouts (even performance pathological ones), which is important to know when debugging. Set at build time. + + 19) ``J`` if a function-level error has been injected and the code path was + forcibly changed by either function error injection framework or BPF's + function override feature. diff --git a/include/linux/panic.h b/include/linux/panic.h index c7759b3f2045..2b03a02d86be 100644 --- a/include/linux/panic.h +++ b/include/linux/panic.h @@ -69,7 +69,8 @@ static inline void set_arch_panic_timeout(int timeout, int arch_default_timeout) #define TAINT_AUX 16 #define TAINT_RANDSTRUCT 17 #define TAINT_TEST 18 -#define TAINT_FLAGS_COUNT 19 +#define TAINT_FAULT_INJECTED 19 +#define TAINT_FLAGS_COUNT 20 #define TAINT_FLAGS_MAX ((1UL << TAINT_FLAGS_COUNT) - 1) struct taint_flag { diff --git a/kernel/fail_function.c b/kernel/fail_function.c index a7ccd2930c5f..80a743f14a2c 100644 --- a/kernel/fail_function.c +++ b/kernel/fail_function.c @@ -9,6 +9,7 @@ #include <linux/kprobes.h> #include <linux/module.h> #include <linux/mutex.h> +#include <linux/panic.h> #include <linux/slab.h> #include <linux/uaccess.h> @@ -298,6 +299,7 @@ static ssize_t fei_write(struct file *file, const char __user *buffer, fei_attr_free(attr); goto out; } + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); fei_debugfs_add_attr(attr); list_add_tail(&attr->list, &fei_attr_list); ret = count; diff --git a/kernel/panic.c b/kernel/panic.c index da323209f583..e396a5fd9bb6 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -426,6 +426,7 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = { [ TAINT_AUX ] = { 'X', ' ', true }, [ TAINT_RANDSTRUCT ] = { 'T', ' ', true }, [ TAINT_TEST ] = { 'N', ' ', true }, + [ TAINT_FAULT_INJECTED ] = { 'J', ' ', false }, }; /** diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1ed08967fb97..de0614d9796c 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2137,6 +2137,8 @@ int perf_event_attach_bpf_prog(struct perf_event *event, goto unlock; /* set the new array to event->tp_event and set event->prog */ + if (prog->kprobe_override) + add_taint(TAINT_FAULT_INJECTED, LOCKDEP_NOW_UNRELIABLE); event->prog = prog; event->bpf_cookie = bpf_cookie; rcu_assign_pointer(event->tp_event->prog_array, new_array); diff --git a/tools/debugging/kernel-chktaint b/tools/debugging/kernel-chktaint index 279be06332be..5eb563766041 100755 --- a/tools/debugging/kernel-chktaint +++ b/tools/debugging/kernel-chktaint @@ -204,6 +204,14 @@ else echo " * an in-kernel test (such as a KUnit test) has been run (#18)" fi +T=`expr $T / 2` +if [ `expr $T % 2` -eq 0 ]; then + addout " " +else + addout "J" + echo " * a function level error has been injected (#19)" +fi + echo "For a more detailed explanation of the various taint flags see" echo " Documentation/admin-guide/tainted-kernels.rst in the Linux kernel sources" echo " or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html"