[2/2] moderr: add module error injection tool

Message ID	20250122-modules-error-injection-v1-2-910590a04fd5@samsung.com (mailing list archive)
State	New
Headers	show Received: from mailout1.w1.samsung.com (mailout1.w1.samsung.com [210.118.77.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D9BAF212FB0 for <linux-modules@vger.kernel.org>; Wed, 22 Jan 2025 13:12:01 +0000 (UTC) From: Daniel Gomez <da.gomez@samsung.com> Date: Wed, 22 Jan 2025 14:11:14 +0100 Subject: [PATCH 2/2] moderr: add module error injection tool Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-ID: <20250122-modules-error-injection-v1-2-910590a04fd5@samsung.com> In-Reply-To: <20250122-modules-error-injection-v1-0-910590a04fd5@samsung.com> To: Luis Chamberlain <mcgrof@kernel.org>, Petr Pavlu <petr.pavlu@suse.com>, Sami Tolvanen <samitolvanen@google.com>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii@kernel.org>, Martin KaFai Lau <martin.lau@linux.dev>, Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>, Yonghong Song <yonghong.song@linux.dev>, John Fastabend <john.fastabend@gmail.com>, "KP Singh" <kpsingh@kernel.org>, Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, Nathan Chancellor <nathan@kernel.org>, Nick Desaulniers <ndesaulniers@google.com>, "Bill Wendling" <morbo@google.com>, Justin Stitt <justinstitt@google.com> CC: <linux-modules@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <bpf@vger.kernel.org>, <llvm@lists.linux.dev>, <iovisor-dev@lists.iovisor.org>, <gost.dev@samsung.com>, Daniel Gomez <da.gomez@samsung.com> CMS-TYPE: 201P References: <20250122-modules-error-injection-v1-0-910590a04fd5@samsung.com> <CGME20250122131159eucas1p17693e311a9b7674288eb3c34014b6f2c@eucas1p1.samsung.com>
Series	[1/2] module: allow for module error injection \| expand [1/2] module: allow for module error injection [2/2] moderr: add module error injection tool

Context	Check	Description
mcgrof/vmtest-modules-next-VM_Test-0	success	Logs for Run CI tests
mcgrof/vmtest-modules-next-VM_Test-1	success	Logs for Run CI tests
mcgrof/vmtest-modules-next-VM_Test-5	fail	Logs for setup / Setup kdevops environment
mcgrof/vmtest-modules-next-VM_Test-4	fail	Logs for setup / Setup kdevops environment
mcgrof/vmtest-modules-next-VM_Test-2	fail	Logs for cleanup / Archive results and cleanup
mcgrof/vmtest-modules-next-VM_Test-3	fail	Logs for cleanup / Archive results and cleanup
mcgrof/vmtest-modules-next-PR	fail	PR summary

On Wed, Mar 05, 2025 at 03:08:25PM -0600, Lucas De Marchi wrote: > +Francois who added most of the error injection points in xe > > On Wed, Mar 05, 2025 at 11:06:57AM +0100, Daniel Gomez wrote: > > On Fri, Feb 28, 2025 at 12:48:38PM +0100, Lucas De Marchi wrote: > > > On Fri, Feb 28, 2025 at 10:27:17AM +0100, Daniel Gomez wrote: > > > > On Mon, Feb 24, 2025 at 08:43:45AM +0100, Lucas De Marchi wrote: > > > > > On Sat, Feb 22, 2025 at 10:35:07PM +0100, Daniel Gomez wrote: > > > > > > On Fri, Feb 21, 2025 at 12:15:40PM +0100, Luis Chamberlain wrote: > > > > > > > On Wed, Feb 19, 2025 at 02:17:48PM -0600, Lucas De Marchi wrote: > > > > > > > > On Tue, Jan 28, 2025 at 12:57:05PM -0800, Luis Chamberlain wrote: > > > > > > > > > On Wed, Jan 22, 2025 at 09:02:19AM -0800, Alexei Starovoitov wrote: > > > > > > > > > > On Wed, Jan 22, 2025 at 5:12 AM Daniel Gomez <da.gomez@samsung.com> wrote: > > > > > > > > > > > > > > > > > > > > > > Add support for a module error injection tool. The tool > > > > > > > > > > > can inject errors in the annotated module kernel functions > > > > > > > > > > > such as complete_formation(), do_init_module() and > > > > > > > > > > > module_enable_rodata_after_init(). Module name and module function are > > > > > > > > > > > required parameters to have control over the error injection. > > > > > > > > > > > > > > > > > > > > > > Example: Inject error -22 to module_enable_rodata_ro_after_init for > > > > > > > > > > > brd module: > > > > > > > > > > > > > > > > > > > > > > sudo moderr --modname=brd --modfunc=module_enable_rodata_ro_after_init \ > > > > > > > > > > > --error=-22 --trace > > > > > > > > > > > Monitoring module error injection... Hit Ctrl-C to end. > > > > > > > > > > > MODULE ERROR FUNCTION > > > > > > > > > > > brd -22 module_enable_rodata_after_init() > > > > > > > > > > > > > > > > > > > > > > Kernel messages: > > > > > > > > > > > [ 89.463690] brd: module loaded > > > > > > > > > > > [ 89.463855] brd: module_enable_rodata_ro_after_init() returned -22, > > > > > > > > > > > ro_after_init data might still be writable > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Daniel Gomez <da.gomez@samsung.com> > > > > > > > > > > > --- > > > > > > > > > > > tools/bpf/Makefile | 13 ++- > > > > > > > > > > > tools/bpf/moderr/.gitignore | 2 + > > > > > > > > > > > tools/bpf/moderr/Makefile | 95 +++++++++++++++++ > > > > > > > > > > > tools/bpf/moderr/moderr.bpf.c | 127 +++++++++++++++++++++++ > > > > > > > > > > > tools/bpf/moderr/moderr.c | 236 ++++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > > > tools/bpf/moderr/moderr.h | 40 +++++++ > > > > > > > > > > > 6 files changed, 510 insertions(+), 3 deletions(-) > > > > > > > > > > > > > > > > > > > > The tool looks useful, but we don't add tools to the kernel repo. > > > > > > > > > > It has to stay out of tree. > > > > > > > > > > > > > > > > > > For selftests we do add random tools. > > > > > > > > > > > > > > > > > > > The value of error injection is not clear to me. > > > > > > > > > > > > > > > > > > It is of great value, since it deals with corner cases which are > > > > > > > > > otherwise hard to reproduce in places which a real error can be > > > > > > > > > catostrophic. > > > > > > > > > > > > > > > > > > > Other places in the kernel use it to test paths in the kernel > > > > > > > > > > that are difficult to do otherwise. > > > > > > > > > > > > > > > > > > Right. > > > > > > > > > > > > > > > > > > > These 3 functions don't seem to be in this category. > > > > > > > > > > > > > > > > > > That's the key here we should focus on. The problem is when a maintainer > > > > > > > > > *does* agree that adding an error injection entry is useful for testing, > > > > > > > > > and we have a developer willing to do the work to help test / validate > > > > > > > > > it. In this case, this error case is rare but we do want to strive to > > > > > > > > > test this as we ramp up and extend our modules selftests. > > > > > > > > > > > > > > > > > > Then there is the aspect of how to mitigate how instrusive code changes > > > > > > > > > to allow error injection are. In 2021 we evaluated the prospect of error > > > > > > > > > injection in-kernel long ago for other areas like the block layer for > > > > > > > > > add_disk() failures [0] but the minimal interface to enable this from > > > > > > > > > userspace with debugfs was considered just too intrusive. > > > > > > > > > > > > > > > > > > This effort tried to evaluate what this could look like with eBPF to > > > > > > > > > mitigate the required in-kernel code, and I believe the light weight > > > > > > > > > nature of it by just requiring a sprinkle with ALLOW_ERROR_INJECTION() > > > > > > > > > suffices to my taste. > > > > > > > > > > > > > > > > > > So, perhaps the tools aspect can just go in: > > > > > > > > > > > > > > > > > > tools/testing/selftests/module/ > > > > > > > > > > > > > > > > but why would it be module-specific? > > > > > > > > > > > > > > Gotta start somewhere. > > > > > > > > > > > > > > > Based on its current implementation > > > > > > > > and discussion about inject.py it seems to be generic enough to be > > > > > > > > useful to test any function annotated with ALLOW_ERROR_INJECTION(). > > > > > > > > > > > > > > > > As xe driver maintainer, it may be interesting to use such a tool: > > > > > > > > > > > > > > > > $ git grep ALLOW_ERROR_INJECT -- drivers/gpu/drm/xe | wc -l 23 > > > > > > > > > > > > > > > > How does this approach compare to writing the function name on debugfs > > > > > > > > (the current approach in xe's testsuite)? > > > > > > > > > > > > > > > > fail_function @ https://docs.kernel.org/fault-injection/fault-injection.html#fault-injection-capabilities-infrastructure > > > > > > > > https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/tests/intel/xe_fault_injection.c?ref_type=heads#L108 > > > > > > > > > > > > > > > > If you decide to have the tool to live somewhere else, then kmod repo > > > > > > > > could be a candidate. > > > > > > > > > > > > > > Would we install this upon install target? > > > > > > > > > > > > > > Danny can decide on this :) > > > > > > > > > > > > > > > Although I think having it in kernel tree is > > > > > > > > simpler maintenance-wise. > > > > > > > > > > > > > > I think we have at least two users upstream who can make use of it. If > > > > > > > we end up going through tools/testing/selftests/module/ first, can't > > > > > > > you make use of it later? > > > > > > > > > > > > What are the features in debugfs required to be useful for xe that we can > > > > > > port to an eBPF version? I see from the link provided the use of probability, > > > > > > interval, times and space but these are configured to allways trigger the error. > > > > > > Is that right? > > > > > > > > > > I don't think we use them... we just set them to "always trigger" and > > > > > then create the conditions for that to happen. But my question still > > > > > remains: what is the benefit of using the bpf approach over writing > > > > > these files? > > > > > > > > The code to trigger the error injection still needs to exist with both > > > > approaches. My understanding from debugfs and the comment from Luis earlier in > > > > the thread is that with eBPF you can mitigate how intrusive in-kernel code > > > > changes are to allow error injection. Luis added the example here [1] for > > > > debugfs. > > > > > > > > [1] https://lore.kernel.org/all/20210512064629.13899-9-mcgrof@kernel.org/ > > > > > > > > To compare patch 8 in [1] with eBPF approach: the patch describes > > > > all the necessary changes required to allow error injection on the > > > > add_disk() path. With eBPF one would simply annotate the function(s) with > > > > ALLOW_ERROR_INJECTION(), e.g. device_add() and replace the return value > > > > in eBPF code with bpf_override_return() as implemented in moderr tool for > > > > module_enable_rdata_after_init() for example. > > > > > > but that is all that we need with the fail_function in debugfs too: > > > > > > $ git grep ALLOW_ERROR_INJECTION -- drivers/gpu/drm/xe > > > drivers/gpu/drm/xe/xe_device.c:ALLOW_ERROR_INJECTION(xe_device_create, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_device.c:ALLOW_ERROR_INJECTION(wait_for_lmem_ready, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_exec_queue.c:ALLOW_ERROR_INJECTION(xe_exec_queue_create_bind, ERRNO); > > > drivers/gpu/drm/xe/xe_ggtt.c:ALLOW_ERROR_INJECTION(xe_ggtt_init_early, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_guc_ads.c:ALLOW_ERROR_INJECTION(xe_guc_ads_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_guc_ct.c:ALLOW_ERROR_INJECTION(xe_guc_ct_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_guc_log.c:ALLOW_ERROR_INJECTION(xe_guc_log_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_guc_relay.c:ALLOW_ERROR_INJECTION(xe_guc_relay_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_pci.c: * ALLOW_ERROR_INJECTION() is used to conditionally skip function execution > > > drivers/gpu/drm/xe/xe_pci.c: * ALLOW_ERROR_INJECTION() macro but this is acceptable because for those > > > drivers/gpu/drm/xe/xe_pm.c:ALLOW_ERROR_INJECTION(xe_pm_init_early, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_pt.c:ALLOW_ERROR_INJECTION(xe_pt_create, ERRNO); > > > drivers/gpu/drm/xe/xe_pt.c:ALLOW_ERROR_INJECTION(xe_pt_update_ops_prepare, ERRNO); > > > drivers/gpu/drm/xe/xe_pt.c:ALLOW_ERROR_INJECTION(xe_pt_update_ops_run, ERRNO); > > > drivers/gpu/drm/xe/xe_sriov.c:ALLOW_ERROR_INJECTION(xe_sriov_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_sync.c:ALLOW_ERROR_INJECTION(xe_sync_entry_parse, ERRNO); > > > drivers/gpu/drm/xe/xe_tile.c:ALLOW_ERROR_INJECTION(xe_tile_init_early, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_uc_fw.c:ALLOW_ERROR_INJECTION(xe_uc_fw_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_vm.c:ALLOW_ERROR_INJECTION(xe_vma_ops_alloc, ERRNO); > > > drivers/gpu/drm/xe/xe_vm.c:ALLOW_ERROR_INJECTION(xe_vm_create_scratch, ERRNO); > > > drivers/gpu/drm/xe/xe_vm.c:ALLOW_ERRORALLOW_ERROR_INJECTION_INJECTION(vm_bind_ioctl_ops_create, ERRNO); > > > drivers/gpu/drm/xe/xe_vm.c:ALLOW_ERROR_INJECTION(vm_bind_ioctl_ops_execute, ERRNO); > > > drivers/gpu/drm/xe/xe_wa.c:ALLOW_ERROR_INJECTION(xe_wa_init, ERRNO); /* See xe_pci_probe() */ > > > drivers/gpu/drm/xe/xe_wopcm.c:ALLOW_ERROR_INJECTION(xe_wopcm_init, ERRNO); /* See xe_pci_probe() */ > > > > > > That is different from the patch you are pointing to because that patch > > > is trying to add arbitrary/named error injection points throughout the > > > code. However via debugfs it's still possible to add error injection to > > > > When reading the patch I assumed the block/failure-injection.c was needed for > > the knobs in sysfs/debugfs. But I see I was wrong and these are only needed for > > the arbitrary error injection points? Correct, ALLOW_ERROR_INJECTION() is sufficient to make a function error injectable and controllable with the debugfs knobs. Arbitrary error injection points might also be added using only ALLOW_ERROR_INJECTION() and extra functions: static int should_X_fail(void) { return 0; } ALLOW_ERROR_INJECTION(should_X_fail, ERRNO); int foo(...) { ... if (should_X_fail()) { // error } ... } But this would introduce a lot of dummy code, which the block/failure-injection.c approach prevents. Until now, fault injection in xe has mostly been used to test error handling during probe and during IOCTLs calls. Errors are unconditionnally injected in each function one by one during tests. This is why probability, interval, times and space are configured to always trigger. Besides, function-level error injection points have been a good fit in the driver, hence the absence of arbitrary error injection points. Francois > > yeah. For working with ALLOW_ERROR_INJECTION() we have to refactor the > code so the functions follow its requirements. When that is true, then > we can simply use the fail_function/inject to trigger it. > > > > > I see mm/fail_page_alloc.c has a similar approach with should_fail_alloc_page(). > > > > > the beginning of a function by annotating that function with > > > ALLOW_ERROR_INJECTION. If a function is annotated with that, then if you > > > do e.g. > > > > > > echo xe_device_create > /sys/kernel/debug/fail_function/inject > > > > > > it will cause that function to fail. There are some additional files to > > > control _when_ that function should fail, but I'm failing to see a clear > > > benefit. See this example in the docs: > > > > Can you clarify if _when_ (in debugfs) allows you to access function arguments > > of a given annotated function with ALLOW_ERROR_INJECTION()? It seems that might > > be the only part that can be moved out of the kernel and handled in eBPF. Other > > than that, I don't see either a benefit of using one approach over the other. > > afaik we can't change the behavior based on arguments when using the > debugfs approach. > > Lucas De Marchi > > > > > > > > > Documentation/fault-injection/fault-injection.rst:- Inject open_ctree error while btrfs mount:: > > > > > > Lucas De Marchi > >

diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile index 243b79f2b451e52ca196f79dc46befd1b3dab458..018cab5102e7e42b8b7b2749f4f463bf55c5119b 100644 --- a/tools/bpf/Makefile +++ b/tools/bpf/Makefile @@ -38,7 +38,7 @@ FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled FEATURE_DISPLAY = libbfd check_feat := 1 -NON_CHECK_FEAT_TARGETS := clean bpftool_clean runqslower_clean resolve_btfids_clean +NON_CHECK_FEAT_TARGETS := clean bpftool_clean moderr_clean runqslower_clean resolve_btfids_clean ifdef MAKECMDGOALS ifeq ($(filter-out $(NON_CHECK_FEAT_TARGETS),$(MAKECMDGOALS)),) check_feat := 0 @@ -76,7 +76,7 @@ $(OUTPUT)%.lex.o: $(OUTPUT)%.lex.c PROGS = $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg $(OUTPUT)bpf_asm -all: $(PROGS) bpftool runqslower +all: $(PROGS) bpftool moderr runqslower $(OUTPUT)bpf_jit_disasm: CFLAGS += -DPACKAGE='bpf_jit_disasm' $(OUTPUT)bpf_jit_disasm: $(OUTPUT)bpf_jit_disasm.o @@ -92,7 +92,7 @@ $(OUTPUT)bpf_exp.lex.c: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.yacc.o: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.lex.o: $(OUTPUT)bpf_exp.lex.c -clean: bpftool_clean runqslower_clean resolve_btfids_clean +clean: bpftool_clean moderr_clean runqslower_clean resolve_btfids_clean $(call QUIET_CLEAN, bpf-progs) $(Q)$(RM) -r -- $(OUTPUT)*.o $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg \ $(OUTPUT)bpf_asm $(OUTPUT)bpf_exp.yacc.* $(OUTPUT)bpf_exp.lex.* @@ -118,6 +118,12 @@ bpftool_install: bpftool_clean: $(call descend,bpftool,clean) +moderr: + $(call descend,moderr) + +moderr_clean: + $(call descend,moderr,clean) + runqslower: $(call descend,runqslower) @@ -131,5 +137,6 @@ resolve_btfids_clean: $(call descend,resolve_btfids,clean) .PHONY: all install clean bpftool bpftool_install bpftool_clean \ + moderr moderr_clean \ runqslower runqslower_clean \ resolve_btfids resolve_btfids_clean diff --git a/tools/bpf/moderr/.gitignore b/tools/bpf/moderr/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..ffdb70230c8bc308efcc8b7d2084856e2225da91 --- /dev/null +++ b/tools/bpf/moderr/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +/.output diff --git a/tools/bpf/moderr/Makefile b/tools/bpf/moderr/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..e6331179f7800e6c1d1945ca713e34f74f7d805d --- /dev/null +++ b/tools/bpf/moderr/Makefile @@ -0,0 +1,95 @@ +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +include ../../scripts/Makefile.include +include ../../scripts/Makefile.arch + +OUTPUT ?= $(abspath .output)/ + +BPFTOOL_OUTPUT := $(OUTPUT)bpftool/ +DEFAULT_BPFTOOL := $(BPFTOOL_OUTPUT)bootstrap/bpftool +BPFTOOL ?= $(DEFAULT_BPFTOOL) +LIBBPF_SRC := $(abspath ../../lib/bpf) +BPFOBJ_OUTPUT := $(OUTPUT)libbpf/ +BPFOBJ := $(BPFOBJ_OUTPUT)libbpf.a +BPF_DESTDIR := $(BPFOBJ_OUTPUT) +BPF_INCLUDE := $(BPF_DESTDIR)include +INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../include/uapi) +CFLAGS := -g -Wall $(CLANG_CROSS_FLAGS) +CFLAGS += $(EXTRA_CFLAGS) +LDFLAGS += $(EXTRA_LDFLAGS) +LDLIBS += -lelf -lz + +# Try to detect best kernel BTF source +KERNEL_REL := $(shell uname -r) +VMLINUX_BTF_PATHS := $(if $(O),$(O)/vmlinux) \ + $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ + ../../../vmlinux /sys/kernel/btf/vmlinux \ + /boot/vmlinux-$(KERNEL_REL) +VMLINUX_BTF_PATH := $(or $(VMLINUX_BTF),$(firstword \ + $(wildcard $(VMLINUX_BTF_PATHS)))) + +ifeq ($(V),1) +Q = +else +Q = @ +MAKEFLAGS += --no-print-directory +submake_extras := feature_display=0 +endif + +.DELETE_ON_ERROR: + +.PHONY: all clean moderr libbpf_hdrs +all: moderr + +moderr: $(OUTPUT)moderr + +clean: + $(call QUIET_CLEAN, moderr) + $(Q)$(RM) -r $(BPFOBJ_OUTPUT) $(BPFTOOL_OUTPUT) + $(Q)$(RM) $(OUTPUT)*.o $(OUTPUT)*.d + $(Q)$(RM) $(OUTPUT)*.skel.h $(OUTPUT)vmlinux.h + $(Q)$(RM) $(OUTPUT)moderr + $(Q)$(RM) -r .output + +libbpf_hdrs: $(BPFOBJ) + +$(OUTPUT)moderr: $(OUTPUT)moderr.o $(BPFOBJ) + $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $^ $(LDLIBS) -o $@ + +$(OUTPUT)moderr.o: moderr.h $(OUTPUT)moderr.skel.h \ + $(OUTPUT)moderr.bpf.o | libbpf_hdrs + +$(OUTPUT)moderr.bpf.o: $(OUTPUT)vmlinux.h moderr.h | libbpf_hdrs + +$(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o | $(BPFTOOL) + $(QUIET_GEN)$(BPFTOOL) gen skeleton $< > $@ + +$(OUTPUT)%.bpf.o: %.bpf.c $(BPFOBJ) | $(OUTPUT) + $(QUIET_GEN)$(CLANG) -g -O2 --target=bpf $(INCLUDES) \ + -D__TARGET_ARCH_$(SRCARCH) \ + -c $(filter %.c,$^) -o $@ && \ + $(LLVM_STRIP) -g $@ + +$(OUTPUT)%.o: %.c | $(OUTPUT) + $(QUIET_CC)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@ + +$(OUTPUT) $(BPFOBJ_OUTPUT) $(BPFTOOL_OUTPUT): + $(QUIET_MKDIR)mkdir -p $@ + +$(OUTPUT)vmlinux.h: $(VMLINUX_BTF_PATH) | $(OUTPUT) $(BPFTOOL) +ifeq ($(VMLINUX_H),) + $(Q)if [ ! -e "$(VMLINUX_BTF_PATH)" ] ; then \ + echo "Couldn't find kernel BTF; set VMLINUX_BTF to" \ + "specify its location." >&2; \ + exit 1;\ + fi + $(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@ +else + $(Q)cp "$(VMLINUX_H)" $@ +endif + +$(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(BPFOBJ_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(BPFOBJ_OUTPUT) \ + DESTDIR=$(BPFOBJ_OUTPUT) prefix= $(abspath $@) install_headers + +$(DEFAULT_BPFTOOL): | $(BPFTOOL_OUTPUT) + $(Q)$(MAKE) $(submake_extras) -C ../bpftool OUTPUT=$(BPFTOOL_OUTPUT) bootstrap diff --git a/tools/bpf/moderr/moderr.bpf.c b/tools/bpf/moderr/moderr.bpf.c new file mode 100644 index 0000000000000000000000000000000000000000..1c5d03336dd87a2f065ef6b608f077a8b988e5cf --- /dev/null +++ b/tools/bpf/moderr/moderr.bpf.c @@ -0,0 +1,127 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Samsung */ +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_core_read.h> +#include "moderr.h" + +const volatile bool filter_modname = false; +const volatile char targ_modname[MODULE_NAME_LEN]; +const volatile bool set_errinj = false; +const volatile int targ_errinj = 0; +const volatile bool filter_modfunc = false; +const volatile int targ_modfunc = 0; + +char LICENSE[] SEC("license") = "GPL"; + +struct { + __uint(type, BPF_MAP_TYPE_RINGBUF); + __uint(max_entries, 2097152); +} rb SEC(".maps"); + +static __always_inline bool filter_module_name(struct module *mod) +{ + char modname[MODULE_NAME_LEN]; + + bpf_probe_read_str(modname, sizeof(modname), mod->name); + + if (!filter_modname || + filter_modname && bpf_strncmp(modname, MODULE_NAME_LEN, + (const char *)targ_modname) != 0) + return false; + + return true; +} + +static __always_inline bool filter_module_func(enum modfunc fc) +{ + if (!filter_modfunc || filter_modfunc && targ_modfunc != fc) + return false; + + return true; +} + +static __always_inline bool +generate_errinj_event(struct pt_regs *ctx, struct module *mod, enum modfunc fc) +{ + struct event *e; + + e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0); + if (!e) + return false; + + e->err = 0; + e->func = fc; + bpf_probe_read_str(e->modname, sizeof(e->modname), mod->name); + + if (set_errinj) { + bpf_override_return(ctx, targ_errinj); + e->err = targ_errinj; + } + + bpf_ringbuf_submit(e, 0); + return true; +} + +static __always_inline bool generate_debug_event(struct pt_regs *ctx, + struct module *mod, + enum modfunc fc, + const char *fmt) +{ + struct event *e; + + e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0); + if (!e) + return false; + + e->dbg = BPF_SNPRINTF(e->msg, sizeof(e->msg), "[%s:%s]: %s", mod->name, + modfunc_to_string(fc), fmt); + + bpf_ringbuf_submit(e, 0); + return true; +} + +static __always_inline int +module_error_injection(struct pt_regs *ctx, struct module *mod, enum modfunc fc) +{ + if (!filter_module_name(mod)) { + generate_debug_event(ctx, mod, fc, + "Target module does not match"); + return 0; + } + + if (!filter_module_func(fc)) { + generate_debug_event(ctx, mod, fc, + "Target function does not match"); + return 0; + } + + if (!generate_errinj_event(ctx, mod, fc)) { + generate_debug_event( + ctx, mod, fc, + "Error injection event cannot be generated"); + return 0; + } + + return 0; +} + +SEC("kprobe/complete_formation") +int BPF_KPROBE(complete_formation, struct module *mod, struct load_info *info) +{ + return module_error_injection(ctx, mod, COMPLETE_FORMATION); +} + +SEC("kprobe/do_init_module") +int BPF_KPROBE(do_init_module, struct module *mod, struct load_info *info) +{ + return module_error_injection(ctx, mod, DO_INIT_MODULE); +} + +SEC("kprobe/module_enable_rodata_ro_after_init") +int BPF_KPROBE(module_enable_rodata_ro_after_init, struct module *mod) +{ + return module_error_injection(ctx, mod, + MODULE_ENABLE_RODATA_AFTER_INIT); +} diff --git a/tools/bpf/moderr/moderr.c b/tools/bpf/moderr/moderr.c new file mode 100644 index 0000000000000000000000000000000000000000..dce18b02b55d1ad1f7e304cb49985d570b115aa4 --- /dev/null +++ b/tools/bpf/moderr/moderr.c @@ -0,0 +1,236 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Samsung */ +#include <argp.h> +#include <signal.h> +#include <stdbool.h> +#include <stdio.h> +#include <bpf/libbpf.h> +#include <bpf/bpf.h> +#include <stdlib.h> +#include <string.h> +#include "moderr.h" +#include "moderr.skel.h" + +static struct env { + bool verbose; + char modname[MODULE_NAME_LEN]; + enum modfunc func; + bool trace; + int errinj; +} env; + +const char *argp_program_version = "moderr 0.1"; +const char *argp_program_bug_address = "<da.gomez@samsung.com>"; +const char argp_program_doc[] = +"BPF moderr application.\n" +"\n" +"It injects errors in module initialization\n" +"\nUSAGE: " +"moderr [-m <module_name>] [-f <function_name>] [-e <errno>]\n"; + +static volatile bool exiting = false; + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) + +static const struct argp_option opts[] = { + { "verbose", 'v', NULL, 0, "Verbose debug output" }, + { "trace", 't', NULL, 0, "Enable trace output", 0 }, + { "modname", 'm', "MODNAME", 0, "Trace this module name only", 0 }, + { "modfunc", 'f', "MODFUNC", 0, "Trace this module function only", 0 }, + { "list", 'l', NULL, 0, "List available module functions", 0 }, + { "error", 'e', "ERROR", 0, "Inject this error", 0 }, + { NULL, 'h', NULL, OPTION_HIDDEN, "Show the full help", 0 }, + {}, +}; + +static void help_modfunc(void) +{ + printf("\nAvailable modfunc options are:\n" + "- complete_formation\n" + "- do_init_module\n" + "- module_enable_rodata_ro_after_init\n\n"); +} + +static enum modfunc string_to_modfunc(char *arg) +{ + if (strncmp(arg, "complete_formation", strlen(arg)) == 0) + return COMPLETE_FORMATION; + + if (strncmp(arg, "do_init_module", strlen(arg)) == 0) + return DO_INIT_MODULE; + + if (strncmp(arg, "module_enable_rodata_ro_after_init", strlen(arg)) == + 0) + return MODULE_ENABLE_RODATA_AFTER_INIT; + + return UNKNOWN; +} + +static error_t parse_arg(int key, char *arg, struct argp_state *state) +{ + switch (key) { + case 'h': + argp_state_help(state, stderr, ARGP_HELP_STD_HELP); + break; + case 'l': + help_modfunc(); + argp_usage(state); + break; + case 'v': + env.verbose = true; + break; + case 'm': + if (strlen(arg) + 1 > MODULE_NAME_LEN) { + fprintf(stderr, "module name error\n"); + argp_usage(state); + } + strncpy(env.modname, arg, sizeof(env.modname) - 1); + break; + case 'f': + if (strlen(arg) + 1 > MODULE_FUNC_LEN) { + fprintf(stderr, "module function too long\n"); + argp_usage(state); + } + env.func = string_to_modfunc(arg); + if (!env.func) { + fprintf(stderr, "invalid module function\n"); + help_modfunc(); + argp_usage(state); + } + break; + case 'e': + env.errinj = atoi(arg); + break; + case 't': + env.trace = true; + break; + case ARGP_KEY_ARG: + argp_usage(state); + break; + default: + return ARGP_ERR_UNKNOWN; + } + return 0; +} + +static const struct argp argp = { + .options = opts, + .parser = parse_arg, + .doc = argp_program_doc, +}; + +static int libbpf_print_fn(enum libbpf_print_level level, const char *format, + va_list args) +{ + if (level == LIBBPF_DEBUG && !env.verbose) + return 0; + return vfprintf(stderr, format, args); +} + +static void sig_handler(int sig) +{ + exiting = true; +} + +static int handle_event(void *ctx, void *data, size_t data_sz) +{ + const struct event *e = data; + + if (!env.trace) + return 0; + + if (e->dbg) { + if (env.verbose) + printf("%s\n", e->msg); + return 0; + } + + printf("%-10s %-5d %-20s\n", e->modname, e->err, + modfunc_to_string(e->func)); + + return 0; +} + +int main(int argc, char **argv) +{ + struct ring_buffer *rb = NULL; + struct moderr_bpf *obj; + int err; + + err = argp_parse(&argp, argc, argv, 0, NULL, NULL); + if (err) + return err; + + if (!strlen(env.modname) || !env.func) { + fprintf(stderr, "missing arguments\n"); + return EXIT_FAILURE; + } + + libbpf_set_print(libbpf_print_fn); + + signal(SIGINT, sig_handler); + signal(SIGTERM, sig_handler); + + obj = moderr_bpf__open(); + if (!obj) { + fprintf(stderr, "failed to open and load BPF object\n"); + return 1; + } + + obj->rodata->filter_modname = true; + strncpy(obj->rodata->targ_modname, env.modname, MODULE_NAME_LEN - 1); + obj->rodata->targ_modname[MODULE_NAME_LEN - 1] = '\0'; + + obj->rodata->filter_modfunc = true; + obj->rodata->targ_modfunc = env.func; + + if (env.errinj) { + obj->rodata->set_errinj = true; + obj->rodata->targ_errinj = env.errinj; + } + + err = moderr_bpf__load(obj); + if (err) { + fprintf(stderr, "failed to load and verify BPF object\n"); + goto cleanup; + } + + err = moderr_bpf__attach(obj); + if (err) { + fprintf(stderr, "failed to attach BPF object\n"); + goto cleanup; + } + + printf("Monitoring module error injection... Hit Ctrl-C to end.\n"); + + rb = ring_buffer__new(bpf_map__fd(obj->maps.rb), handle_event, NULL, + NULL); + if (!rb) { + err = -1; + fprintf(stderr, "failed to create ring buffer\n"); + goto cleanup; + } + + if (env.trace) + printf("%-10s %-5s %-20s\n", "MODULE", "ERROR", "FUNCTION"); + + while (!exiting) { + err = ring_buffer__poll(rb, 100); + if (err == -EINTR) { + err = 0; + break; + } + if (err < 0) { + fprintf(stderr, "error polling ring buffer: %d\n", err); + break; + } + } + + printf("\n"); + +cleanup: + ring_buffer__free(rb); + moderr_bpf__destroy(obj); + + return err < 0 ? -err : 0; +} diff --git a/tools/bpf/moderr/moderr.h b/tools/bpf/moderr/moderr.h new file mode 100644 index 0000000000000000000000000000000000000000..e17440cf4bd5fe09b927cb83807a88f66861bba5 --- /dev/null +++ b/tools/bpf/moderr/moderr.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2025 Samsung */ +#ifndef __MODERR_H +#define __MODERR_H + +#define MAX_PARAM_PREFIX_LEN (64 - sizeof(unsigned long)) +#define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN +#define MODULE_FUNC_LEN 128 +#define MESSAGE_LEN 128 + +enum modfunc { + UNKNOWN, + COMPLETE_FORMATION = 1, + DO_INIT_MODULE, + MODULE_ENABLE_RODATA_AFTER_INIT, +}; + +struct event { + char modname[MODULE_NAME_LEN]; + int err; + int func; + char msg[MESSAGE_LEN]; + int dbg; +}; + +static inline const char *modfunc_to_string(enum modfunc fc) +{ + switch (fc) { + case COMPLETE_FORMATION: + return "complete_formation()"; + case DO_INIT_MODULE: + return "do_init_module()"; + case MODULE_ENABLE_RODATA_AFTER_INIT: + return "module_enable_rodata_after_init()"; + default: + return "unknown"; + } +} + +#endif /* __MODERR_H */

[2/2] moderr: add module error injection tool

Checks

Commit Message

Comments

Patch