diff mbox series

tracing/probes: Fix MAX_TRACE_ARGS limit handling

Message ID 20240929200939.162524-1-mikel@mikelr.com (mailing list archive)
State Superseded
Headers show
Series tracing/probes: Fix MAX_TRACE_ARGS limit handling | expand

Commit Message

Mikel Rychliski Sept. 29, 2024, 8:09 p.m. UTC
When creating a trace_probe we would set nr_args prior to truncating the
arguments to MAX_TRACE_ARGS. However, we would only initialize arguments
up to the limit.

This caused invalid memory access when attempting to set up probes with
more than 128 fetchargs.

  BUG: kernel NULL pointer dereference, address: 0000000000000020
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 0 UID: 0 PID: 1769 Comm: cat Not tainted 6.11.0-rc7+ #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
  RIP: 0010:__set_print_fmt+0x134/0x330

Resolve the issue by applying the MAX_TRACE_ARGS limit earlier. This
restores the prior behaviour of silently ignoring excess arguments.

Fixes: 035ba76014c0 ("tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init")
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
---
 kernel/trace/trace_eprobe.c | 5 +++--
 kernel/trace/trace_fprobe.c | 3 ++-
 kernel/trace/trace_kprobe.c | 3 ++-
 kernel/trace/trace_uprobe.c | 3 ++-
 4 files changed, 9 insertions(+), 5 deletions(-)


base-commit: 886f3732878dc92fb0ad6d8b6740b66410d1d50a

Comments

Masami Hiramatsu (Google) Sept. 29, 2024, 11:40 p.m. UTC | #1
On Sun, 29 Sep 2024 16:09:37 -0400
Mikel Rychliski <mikel@mikelr.com> wrote:

> When creating a trace_probe we would set nr_args prior to truncating the
> arguments to MAX_TRACE_ARGS. However, we would only initialize arguments
> up to the limit.
> 
> This caused invalid memory access when attempting to set up probes with
> more than 128 fetchargs.
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000020
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 0 UID: 0 PID: 1769 Comm: cat Not tainted 6.11.0-rc7+ #8
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
>   RIP: 0010:__set_print_fmt+0x134/0x330
> 
> Resolve the issue by applying the MAX_TRACE_ARGS limit earlier. This
> restores the prior behaviour of silently ignoring excess arguments.

Good catch! But this silently drop the arguments after MAX_TRACE_ARGS.
I rather like to reject such input with an error (-E2BIG) as below.
(Hmm, and I also need a new ftracetest test case for this.)

diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 39877c80d6cb..3f6654127d8c 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -2194,6 +2194,9 @@ int trace_probe_create(const char *raw_command, int (*createfn)(int, const char
 	if (!argv)
 		return -ENOMEM;
 
+	if (argc > MAX_TRACE_ARGS + 2)
+		return -E2BIG;
+
 	if (argc)
 		ret = createfn(argc, (const char **)argv);
Mikel Rychliski Sept. 30, 2024, 3:17 a.m. UTC | #2
On Sunday, September 29, 2024 7:40:18 P.M. EDT Masami Hiramatsu wrote:
> Good catch! But this silently drop the arguments after MAX_TRACE_ARGS.
> I rather like to reject such input with an error (-E2BIG) as below.
> (Hmm, and I also need a new ftracetest test case for this.)
> 
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 39877c80d6cb..3f6654127d8c 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -2194,6 +2194,9 @@ int trace_probe_create(const char *raw_command, int
> (*createfn)(int, const char if (!argv)
>                 return -ENOMEM;
> 
> +       if (argc > MAX_TRACE_ARGS + 2)
> +               return -E2BIG;
> +
>         if (argc)
>                 ret = createfn(argc, (const char **)argv);

I think the logic still needs to be cleaned up in the individual probe 
implementations (either to count consistently or remove the limit enforcement 
there), otherwise you can get an oops with something like:

	echo "f:testprobe copy_process" arg{1..127}=\$stack "\$arg*" > out
	cat out > /sys/kernel/debug/tracing/dynamic_events

BTF argument expansion results in >128 arguments, but we still attempt to 
process the excess unparsed ones.
Masami Hiramatsu (Google) Sept. 30, 2024, 3:40 p.m. UTC | #3
On Sun, 29 Sep 2024 23:17:14 -0400
Mikel Rychliski <mikel@mikelr.com> wrote:

> On Sunday, September 29, 2024 7:40:18 P.M. EDT Masami Hiramatsu wrote:
> > Good catch! But this silently drop the arguments after MAX_TRACE_ARGS.
> > I rather like to reject such input with an error (-E2BIG) as below.
> > (Hmm, and I also need a new ftracetest test case for this.)
> > 
> > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> > index 39877c80d6cb..3f6654127d8c 100644
> > --- a/kernel/trace/trace_probe.c
> > +++ b/kernel/trace/trace_probe.c
> > @@ -2194,6 +2194,9 @@ int trace_probe_create(const char *raw_command, int
> > (*createfn)(int, const char if (!argv)
> >                 return -ENOMEM;
> > 
> > +       if (argc > MAX_TRACE_ARGS + 2)
> > +               return -E2BIG;
> > +
> >         if (argc)
> >                 ret = createfn(argc, (const char **)argv);
> 
> I think the logic still needs to be cleaned up in the individual probe 
> implementations (either to count consistently or remove the limit enforcement 
> there), otherwise you can get an oops with something like:
> 
> 	echo "f:testprobe copy_process" arg{1..127}=\$stack "\$arg*" > out
> 	cat out > /sys/kernel/debug/tracing/dynamic_events

Ah, good catch. Yes, the "$arg*" problem is still there.

> 
> BTF argument expansion results in >128 arguments, but we still attempt to 
> process the excess unparsed ones.

OK, can you update your version to return an error from each probe?

Thank you,
diff mbox series

Patch

diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index b0e0ec85912e..62c1a1fba403 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -914,7 +914,7 @@  static int __trace_eprobe_create(int argc, const char *argv[])
 
 	mutex_lock(&event_mutex);
 	event_call = find_and_get_event(sys_name, sys_event);
-	ep = alloc_event_probe(group, event, event_call, argc - 2);
+	ep = alloc_event_probe(group, event, event_call, min(argc - 2, MAX_TRACE_ARGS));
 	mutex_unlock(&event_mutex);
 
 	if (IS_ERR(ep)) {
@@ -936,8 +936,9 @@  static int __trace_eprobe_create(int argc, const char *argv[])
 		ep->filter_str = NULL;
 
 	argc -= 2; argv += 2;
+	argc = min(argc, MAX_TRACE_ARGS);
 	/* parse arguments */
-	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+	for (i = 0; i < argc; i++) {
 		trace_probe_log_set_index(i + 2);
 		ret = trace_eprobe_tp_update_arg(ep, argv, i);
 		if (ret)
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index a079abd8955b..ca72fe8a04e7 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -1187,6 +1187,7 @@  static int __trace_fprobe_create(int argc, const char *argv[])
 		argc = new_argc;
 		argv = new_argv;
 	}
+	argc = min(argc, MAX_TRACE_ARGS);
 
 	ret = traceprobe_expand_dentry_args(argc, argv, &dbuf);
 	if (ret)
@@ -1203,7 +1204,7 @@  static int __trace_fprobe_create(int argc, const char *argv[])
 	}
 
 	/* parse arguments */
-	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+	for (i = 0; i < argc; i++) {
 		trace_probe_log_set_index(i + 2);
 		ctx.offset = 0;
 		ret = traceprobe_parse_probe_arg(&tf->tp, i, argv[i], &ctx);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 61a6da808203..10d16b574db5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1013,6 +1013,7 @@  static int __trace_kprobe_create(int argc, const char *argv[])
 		argc = new_argc;
 		argv = new_argv;
 	}
+	argc = min(argc, MAX_TRACE_ARGS);
 
 	ret = traceprobe_expand_dentry_args(argc, argv, &dbuf);
 	if (ret)
@@ -1029,7 +1030,7 @@  static int __trace_kprobe_create(int argc, const char *argv[])
 	}
 
 	/* parse arguments */
-	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+	for (i = 0; i < argc; i++) {
 		trace_probe_log_set_index(i + 2);
 		ctx.offset = 0;
 		ret = traceprobe_parse_probe_arg(&tk->tp, i, argv[i], &ctx);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c3df411a2684..985d23d57702 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -676,6 +676,7 @@  static int __trace_uprobe_create(int argc, const char **argv)
 
 	argc -= 2;
 	argv += 2;
+	argc = min(argc, MAX_TRACE_ARGS);
 
 	tu = alloc_trace_uprobe(group, event, argc, is_return);
 	if (IS_ERR(tu)) {
@@ -690,7 +691,7 @@  static int __trace_uprobe_create(int argc, const char **argv)
 	tu->filename = filename;
 
 	/* parse arguments */
-	for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
+	for (i = 0; i < argc; i++) {
 		struct traceprobe_parse_context ctx = {
 			.flags = (is_return ? TPARG_FL_RETURN : 0) | TPARG_FL_USER,
 		};