diff mbox series

libbpf: Remove kprobe_event on failed kprobe_open_legacy

Message ID 20220614084930.43276-1-nashuiliang@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series libbpf: Remove kprobe_event on failed kprobe_open_legacy | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Kernel LATEST on z15 with gcc

Commit Message

Chuang Wang June 14, 2022, 8:49 a.m. UTC
In a scenario where livepatch and aggrprobe coexist, the creating
kprobe_event using tracefs API will succeed, a trace event (e.g.
/debugfs/tracing/events/kprobe/XX) will exist, but perf_event_open()
will return an error.

Signed-off-by: Chuang W <nashuiliang@gmail.com>
Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com>
---
 tools/lib/bpf/libbpf.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

John Fastabend June 18, 2022, 4:13 a.m. UTC | #1
Chuang W wrote:
> In a scenario where livepatch and aggrprobe coexist, the creating
> kprobe_event using tracefs API will succeed, a trace event (e.g.
> /debugfs/tracing/events/kprobe/XX) will exist, but perf_event_open()
> will return an error.

This seems a bit strange from API side. I'm not really familiar with
livepatch, but I guess this is UAPI now so fixing add_kprobe_event_legacy
to fail is not an option?

> 
> Signed-off-by: Chuang W <nashuiliang@gmail.com>
> Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com>
> ---
>  tools/lib/bpf/libbpf.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 0781fae58a06..d0a36350e22a 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10809,10 +10809,11 @@ static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
>  	}
>  	type = determine_kprobe_perf_type_legacy(probe_name, retprobe);
>  	if (type < 0) {
> +		err = type;
>  		pr_warn("failed to determine legacy kprobe event id for '%s+0x%zx': %s\n",
>  			kfunc_name, offset,
> -			libbpf_strerror_r(type, errmsg, sizeof(errmsg)));
> -		return type;
> +			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
> +		goto clear_kprobe_event;
>  	}
>  	attr.size = sizeof(attr);
>  	attr.config = type;
> @@ -10826,9 +10827,14 @@ static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
>  		err = -errno;
>  		pr_warn("legacy kprobe perf_event_open() failed: %s\n",
>  			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
> -		return err;
> +		goto clear_kprobe_event;
>  	}
>  	return pfd;
> +
> +clear_kprobe_event:
> +	/* Clear the newly added kprobe_event */
> +	remove_kprobe_event_legacy(probe_name, retprobe);
> +	return err;
>  }
>  
>  struct bpf_link *
> -- 
> 2.34.1
>
Chuang Wang June 18, 2022, 5:31 a.m. UTC | #2
Hi John,

On Sat, Jun 18, 2022 at 12:13 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Chuang W wrote:
> > In a scenario where livepatch and aggrprobe coexist, the creating
> > kprobe_event using tracefs API will succeed, a trace event (e.g.
> > /debugfs/tracing/events/kprobe/XX) will exist, but perf_event_open()
> > will return an error.
>
> This seems a bit strange from API side. I'm not really familiar with
> livepatch, but I guess this is UAPI now so fixing add_kprobe_event_legacy
> to fail is not an option?
>

The legacy kprobe API (i.e. tracefs API) has two steps:

1) register_kprobe
$ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events
This will create a trace event of mykprobe and register a disable
kprobe that waits to be activated.

2) enable_kprobe
2.1) using syscall perf_event_open
as the following code, perf_event_kprobe_open_legacy (file:
tools/lib/bpf/libbpf.c):
---
attr.type = PERF_TYPE_TRACEPOINT;
pfd = syscall(__NR_perf_event_open, &attr,
              pid < 0 ? -1 : pid, /* pid */
              pid == -1 ? 0 : -1, /* cpu */
              -1 /* group_fd */,  PERF_FLAG_FD_CLOEXEC);
---
In the implementation code of perf_event_open, enable_kprobe() will be executed.
2.2) using shell
$ echo 1 > /sys/kernel/debug/tracing/events/kprobes/mykprobe/enable
As with perf_event_open, enable_kprobe() will also be executed.

When using the same function XXX, kprobe and livepatch cannot coexist,
that is, step 2) will return an error (ref: arm_kprobe_ftrace()),
however, step 1) is ok!
However, the new kprobe API (i.e. perf kprobe API) aggregates
register_kprobe and enable_kprobe, internally fixes the issue on
failed enable_kprobe.
But above all, for the legacy kprobe API, I think it should remove
kprobe_event on failed add_kprobe_event_legacy() in
perf_event_kprobe_open_legacy (file: tools/lib/bpf/libbpf.c).
Jiri Olsa June 18, 2022, 9:17 p.m. UTC | #3
On Sat, Jun 18, 2022 at 01:31:01PM +0800, chuang wrote:
> Hi John,
> 
> On Sat, Jun 18, 2022 at 12:13 PM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Chuang W wrote:
> > > In a scenario where livepatch and aggrprobe coexist, the creating
> > > kprobe_event using tracefs API will succeed, a trace event (e.g.
> > > /debugfs/tracing/events/kprobe/XX) will exist, but perf_event_open()
> > > will return an error.
> >
> > This seems a bit strange from API side. I'm not really familiar with
> > livepatch, but I guess this is UAPI now so fixing add_kprobe_event_legacy
> > to fail is not an option?
> >
> 
> The legacy kprobe API (i.e. tracefs API) has two steps:
> 
> 1) register_kprobe
> $ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events
> This will create a trace event of mykprobe and register a disable
> kprobe that waits to be activated.
> 
> 2) enable_kprobe
> 2.1) using syscall perf_event_open
> as the following code, perf_event_kprobe_open_legacy (file:
> tools/lib/bpf/libbpf.c):
> ---
> attr.type = PERF_TYPE_TRACEPOINT;
> pfd = syscall(__NR_perf_event_open, &attr,
>               pid < 0 ? -1 : pid, /* pid */
>               pid == -1 ? 0 : -1, /* cpu */
>               -1 /* group_fd */,  PERF_FLAG_FD_CLOEXEC);
> ---
> In the implementation code of perf_event_open, enable_kprobe() will be executed.
> 2.2) using shell
> $ echo 1 > /sys/kernel/debug/tracing/events/kprobes/mykprobe/enable
> As with perf_event_open, enable_kprobe() will also be executed.
> 
> When using the same function XXX, kprobe and livepatch cannot coexist,
> that is, step 2) will return an error (ref: arm_kprobe_ftrace()),

just curious.. is that because of ipmodify flag on ftrace_ops?
AFAICS that be a poblem just for kretprobes, cc-ing Masami

thanks,
jirka


> however, step 1) is ok!
> However, the new kprobe API (i.e. perf kprobe API) aggregates
> register_kprobe and enable_kprobe, internally fixes the issue on
> failed enable_kprobe.
> But above all, for the legacy kprobe API, I think it should remove
> kprobe_event on failed add_kprobe_event_legacy() in
> perf_event_kprobe_open_legacy (file: tools/lib/bpf/libbpf.c).
Chuang Wang June 19, 2022, 1:56 a.m. UTC | #4
Hi Jiri,

On Sun, Jun 19, 2022 at 5:17 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> just curious.. is that because of ipmodify flag on ftrace_ops?
> AFAICS that be a poblem just for kretprobes, cc-ing Masami
>

Yes, the core reason is caused by ipmodify flag (not only for kretprobes).
Before commit 0bc11ed5ab60 ("kprobes: Allow kprobes coexist with
livepatch"), it's very easy to trigger this problem.
The kprobe has other problems and is communicating with Masami.

With this fix, whenever an error is returned after
add_kprobe_event_legacy(), this guarantees cleanup of the kprobe
event.
Daniel Borkmann June 20, 2022, 12:51 p.m. UTC | #5
On 6/19/22 3:56 AM, chuang wrote:
> On Sun, Jun 19, 2022 at 5:17 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>>
>> just curious.. is that because of ipmodify flag on ftrace_ops?
>> AFAICS that be a poblem just for kretprobes, cc-ing Masami
> 
> Yes, the core reason is caused by ipmodify flag (not only for kretprobes).
> Before commit 0bc11ed5ab60 ("kprobes: Allow kprobes coexist with
> livepatch"), it's very easy to trigger this problem.
> The kprobe has other problems and is communicating with Masami.
> 
> With this fix, whenever an error is returned after
> add_kprobe_event_legacy(), this guarantees cleanup of the kprobe
> event.

The details from this follow-up conversation should definitely be part of
the commit description, please add them to a v2 as otherwise context is
missing if we look at the commit again in say few months from now. Also,
would be good if Masami could provide ack given 0bc11ed5ab60.

Thanks,
Daniel
diff mbox series

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 0781fae58a06..d0a36350e22a 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10809,10 +10809,11 @@  static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
 	}
 	type = determine_kprobe_perf_type_legacy(probe_name, retprobe);
 	if (type < 0) {
+		err = type;
 		pr_warn("failed to determine legacy kprobe event id for '%s+0x%zx': %s\n",
 			kfunc_name, offset,
-			libbpf_strerror_r(type, errmsg, sizeof(errmsg)));
-		return type;
+			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
+		goto clear_kprobe_event;
 	}
 	attr.size = sizeof(attr);
 	attr.config = type;
@@ -10826,9 +10827,14 @@  static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
 		err = -errno;
 		pr_warn("legacy kprobe perf_event_open() failed: %s\n",
 			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
-		return err;
+		goto clear_kprobe_event;
 	}
 	return pfd;
+
+clear_kprobe_event:
+	/* Clear the newly added kprobe_event */
+	remove_kprobe_event_legacy(probe_name, retprobe);
+	return err;
 }
 
 struct bpf_link *