diff mbox series

tracing/kprobes: Check whether get_kretprobe() returns NULL in kretprobe_dispatcher()

Message ID 165366693881.797669.16926184644089588731.stgit@devnote2 (mailing list archive)
State Not Applicable
Delegated to: BPF
Headers show
Series tracing/kprobes: Check whether get_kretprobe() returns NULL in kretprobe_dispatcher() | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Kernel LATEST on z15 with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Kernel LATEST on ubuntu-latest with llvm-15
netdev/tree_selection success Not a local patch

Commit Message

Masami Hiramatsu (Google) May 27, 2022, 3:55 p.m. UTC
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

There is a small chance that get_kretprobe(ri) returns NULL in
kretprobe_dispatcher() when another CPU unregisters the kretprobe
right after __kretprobe_trampoline_handler().

To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
return value again. And if it is NULL, it returns soon because that
kretprobe is under unregistering process.

This issue has been introduced when the kretprobe is decoupled
from the struct kretprobe_instance by commit d741bf41d7c7
("kprobes: Remove kretprobe hash"). Before that commit, the
struct kretprob_instance::rp directly points the kretprobe
and it is never be NULL.

Reported-by: Yonghong Song <yhs@fb.com>
Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/trace_kprobe.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Song Liu May 30, 2022, 5:19 a.m. UTC | #1
> On May 27, 2022, at 8:55 AM, Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> 
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
> 
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
> 
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
> 
> Reported-by: Yonghong Song <yhs@fb.com>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Song Liu <song@kernel.org>

[...]
Daniel Borkmann May 30, 2022, 7:33 p.m. UTC | #2
On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
> 
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
> 
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
> 
> Reported-by: Yonghong Song <yhs@fb.com>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Steven, I presume you'll pick this fix up?

Thanks,
Daniel
Steven Rostedt May 30, 2022, 10 p.m. UTC | #3
On May 30, 2022 9:33:23 PM GMT+02:00, Daniel Borkmann <daniel@iogearbox.net> wrote:
>On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
>> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>> 
>> There is a small chance that get_kretprobe(ri) returns NULL in
>> kretprobe_dispatcher() when another CPU unregisters the kretprobe
>> right after __kretprobe_trampoline_handler().
>> 
>> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
>> return value again. And if it is NULL, it returns soon because that
>> kretprobe is under unregistering process.
>> 
>> This issue has been introduced when the kretprobe is decoupled
>> from the struct kretprobe_instance by commit d741bf41d7c7
>> ("kprobes: Remove kretprobe hash"). Before that commit, the
>> struct kretprob_instance::rp directly points the kretprobe
>> and it is never be NULL.
>> 
>> Reported-by: Yonghong Song <yhs@fb.com>
>> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
>Steven, I presume you'll pick this fix up?

I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)

-- Steve
>
>Thanks,
>Daniel
Daniel Borkmann June 8, 2022, 12:38 p.m. UTC | #4
On 5/31/22 12:00 AM, Steven Rostedt wrote:
> On May 30, 2022 9:33:23 PM GMT+02:00, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 5/27/22 5:55 PM, Masami Hiramatsu (Google) wrote:
>>> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>>
>>> There is a small chance that get_kretprobe(ri) returns NULL in
>>> kretprobe_dispatcher() when another CPU unregisters the kretprobe
>>> right after __kretprobe_trampoline_handler().
>>>
>>> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
>>> return value again. And if it is NULL, it returns soon because that
>>> kretprobe is under unregistering process.
>>>
>>> This issue has been introduced when the kretprobe is decoupled
>>> from the struct kretprobe_instance by commit d741bf41d7c7
>>> ("kprobes: Remove kretprobe hash"). Before that commit, the
>>> struct kretprob_instance::rp directly points the kretprobe
>>> and it is never be NULL.
>>>
>>> Reported-by: Yonghong Song <yhs@fb.com>
>>> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>
>> Steven, I presume you'll pick this fix up?
> 
> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)

Ok, thanks. If I don't hear back I presume you'll pick it up then.
Steven Rostedt June 8, 2022, 1:10 p.m. UTC | #5
On Wed, 8 Jun 2022 14:38:39 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> >>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>  
> >>
> >> Steven, I presume you'll pick this fix up?  
> > 
> > I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)  
> 
> Ok, thanks. If I don't hear back I presume you'll pick it up then.

Yeah, I'm way behind due to the conference. And I'll be on PTO from
tomorrow and back on Tuesday. And registration for Linux Plumbers is
supposed to open today (but of course there's issues with that!), thus, I'm
really have too much on my plate today :-p

-- Steve
Alexei Starovoitov June 8, 2022, 1:57 p.m. UTC | #6
On Wed, Jun 8, 2022 at 6:36 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 8 Jun 2022 14:38:39 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> > >>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >>
> > >> Steven, I presume you'll pick this fix up?
> > >
> > > I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
> >
> > Ok, thanks. If I don't hear back I presume you'll pick it up then.
>
> Yeah, I'm way behind due to the conference. And I'll be on PTO from
> tomorrow and back on Tuesday. And registration for Linux Plumbers is
> supposed to open today (but of course there's issues with that!), thus, I'm
> really have too much on my plate today :-p

Steven,

you missed Davem's presentation at KR about importance of delegation.
You need watch it:
https://youtu.be/ELPENQrtUas?t=9085
Steven Rostedt June 8, 2022, 1:59 p.m. UTC | #7
On Wed, 8 Jun 2022 06:57:00 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> you missed Davem's presentation at KR about importance of delegation.
> You need watch it:
> https://youtu.be/ELPENQrtUas?t=9085

It's on my TODO list.

-- Steve
Daniel Borkmann June 17, 2022, 8:26 a.m. UTC | #8
On 6/8/22 3:10 PM, Steven Rostedt wrote:
> On Wed, 8 Jun 2022 14:38:39 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
>>>>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>>>
>>>> Steven, I presume you'll pick this fix up?
>>>
>>> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
>>
>> Ok, thanks. If I don't hear back I presume you'll pick it up then.
> 
> Yeah, I'm way behind due to the conference. And I'll be on PTO from
> tomorrow and back on Tuesday. And registration for Linux Plumbers is
> supposed to open today (but of course there's issues with that!), thus, I'm
> really have too much on my plate today :-p

Steven, we still have this in our patchwork for tracking so it doesn't fall
off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
or do you want to Ack and we ship the fix via bpf tree? Just asking as I
didn't see any further updates ever since.

Thanks,
Daniel
Daniel Borkmann June 17, 2022, 8:32 a.m. UTC | #9
On 6/17/22 10:26 AM, Daniel Borkmann wrote:
> On 6/8/22 3:10 PM, Steven Rostedt wrote:
>> On Wed, 8 Jun 2022 14:38:39 +0200
>> Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>>>>>> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>>>>
>>>>> Steven, I presume you'll pick this fix up?
>>>>
>>>> I'm currently at Embedded/Kernel Recipes, but yeah, I'll take a look at it. (Just need to finish my slides first ;-)
>>>
>>> Ok, thanks. If I don't hear back I presume you'll pick it up then.
>>
>> Yeah, I'm way behind due to the conference. And I'll be on PTO from
>> tomorrow and back on Tuesday. And registration for Linux Plumbers is
>> supposed to open today (but of course there's issues with that!), thus, I'm
>> really have too much on my plate today :-p
> 
> Steven, we still have this in our patchwork for tracking so it doesn't fall
> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
> didn't see any further updates ever since.

Same goes for these two fixes waiting for Ack:

   https://lore.kernel.org/bpf/165461825202.280167.12903689442217921817.stgit@devnote2/

Thanks,
Daniel
Jiri Olsa June 17, 2022, 10:21 a.m. UTC | #10
On Sat, May 28, 2022 at 12:55:39AM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> There is a small chance that get_kretprobe(ri) returns NULL in
> kretprobe_dispatcher() when another CPU unregisters the kretprobe
> right after __kretprobe_trampoline_handler().
> 
> To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
> return value again. And if it is NULL, it returns soon because that
> kretprobe is under unregistering process.
> 
> This issue has been introduced when the kretprobe is decoupled
> from the struct kretprobe_instance by commit d741bf41d7c7
> ("kprobes: Remove kretprobe hash"). Before that commit, the
> struct kretprob_instance::rp directly points the kretprobe
> and it is never be NULL.
> 
> Reported-by: Yonghong Song <yhs@fb.com>
> Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Jiri Olsa <jolsa@kernel.org>

jirka

> ---
>  kernel/trace/trace_kprobe.c |   11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 93507330462c..a245ea673715 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1718,8 +1718,17 @@ static int
>  kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
>  {
>  	struct kretprobe *rp = get_kretprobe(ri);
> -	struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
> +	struct trace_kprobe *tk;
> +
> +	/*
> +	 * There is a small chance that get_kretprobe(ri) returns NULL when
> +	 * the kretprobe is unregister on another CPU between kretprobe's
> +	 * trampoline_handler and this function.
> +	 */
> +	if (unlikely(!rp))
> +		return 0;
>  
> +	tk = container_of(rp, struct trace_kprobe, rp);
>  	raw_cpu_inc(*tk->nhit);
>  
>  	if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
>
Steven Rostedt June 17, 2022, 4:02 p.m. UTC | #11
On Fri, 17 Jun 2022 10:26:40 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> Steven, we still have this in our patchwork for tracking so it doesn't fall
> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
> didn't see any further updates ever since.

Sorry, between traveling for conferences and PTO I fell behind. I'll pull
this into my urgent queue and start running my tests on it.

-- Steve
Daniel Borkmann June 17, 2022, 4:09 p.m. UTC | #12
On 6/17/22 6:02 PM, Steven Rostedt wrote:
> On Fri, 17 Jun 2022 10:26:40 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
>> Steven, we still have this in our patchwork for tracking so it doesn't fall
>> off the radar. The patch is 3 weeks old by now. Has this been picked up yet,
>> or do you want to Ack and we ship the fix via bpf tree? Just asking as I
>> didn't see any further updates ever since.
> 
> Sorry, between traveling for conferences and PTO I fell behind. I'll pull
> this into my urgent queue and start running my tests on it.

Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.
Steven Rostedt June 17, 2022, 5:48 p.m. UTC | #13
On Fri, 17 Jun 2022 18:09:44 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> > Sorry, between traveling for conferences and PTO I fell behind. I'll pull
> > this into my urgent queue and start running my tests on it.  
> 
> Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.

This patch I'll take, but please keep the fprobe/rethook one, as that's
specific to your tree.

-- Steve
Daniel Borkmann June 17, 2022, 7:59 p.m. UTC | #14
On 6/17/22 7:48 PM, Steven Rostedt wrote:
> On Fri, 17 Jun 2022 18:09:44 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
>>> Sorry, between traveling for conferences and PTO I fell behind. I'll pull
>>> this into my urgent queue and start running my tests on it.
>>
>> Okay, if you pick these fixes up today then I'll toss them from our bpf patchwork.
> 
> This patch I'll take, but please keep the fprobe/rethook one, as that's
> specific to your tree.

Ok, sounds good to me. I just applied them with the comment suggestion added.
diff mbox series

Patch

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 93507330462c..a245ea673715 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1718,8 +1718,17 @@  static int
 kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
 {
 	struct kretprobe *rp = get_kretprobe(ri);
-	struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
+	struct trace_kprobe *tk;
+
+	/*
+	 * There is a small chance that get_kretprobe(ri) returns NULL when
+	 * the kretprobe is unregister on another CPU between kretprobe's
+	 * trampoline_handler and this function.
+	 */
+	if (unlikely(!rp))
+		return 0;
 
+	tk = container_of(rp, struct trace_kprobe, rp);
 	raw_cpu_inc(*tk->nhit);
 
 	if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))