diff mbox series

bpf: fix sample_flags for bpf_perf_event_output

Message ID 20221007081327.1047552-1-sumanthk@linux.ibm.com (mailing list archive)
State Handled Elsewhere
Delegated to: BPF
Headers show
Series bpf: fix sample_flags for bpf_perf_event_output | expand

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 15 this patch: 6
netdev/cc_maintainers fail 13 maintainers not CCed: sdf@google.com john.fastabend@gmail.com andrii@kernel.org yhs@fb.com rostedt@goodmis.org ast@kernel.org haoluo@google.com mingo@redhat.com jolsa@kernel.org kpsingh@kernel.org song@kernel.org daniel@iogearbox.net martin.lau@linux.dev
netdev/build_clang fail Errors and warnings before: 5 this patch: 7
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes fail Problems with Fixes tag: 1
netdev/build_allmodconfig_warn fail Errors and warnings before: 15 this patch: 6
netdev/checkpatch warning WARNING: Unknown commit id '838d9bb62d13', maybe rebased or not pulled?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 pending Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix

Commit Message

Sumanth Korikkar Oct. 7, 2022, 8:13 a.m. UTC
* Raw data is also filled by bpf_perf_event_output.
* Add sample_flags to indicate raw data.
* This eliminates the segfaults as shown below:
  Run ./samples/bpf/trace_output
  BUG pid 9 cookie 1001000000004 sized 4
  BUG pid 9 cookie 1001000000004 sized 4
  BUG pid 9 cookie 1001000000004 sized 4
  Segmentation fault (core dumped)

Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
---
 kernel/trace/bpf_trace.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jiri Olsa Oct. 7, 2022, 9:45 a.m. UTC | #1
On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> * Raw data is also filled by bpf_perf_event_output.
> * Add sample_flags to indicate raw data.
> * This eliminates the segfaults as shown below:
>   Run ./samples/bpf/trace_output
>   BUG pid 9 cookie 1001000000004 sized 4
>   BUG pid 9 cookie 1001000000004 sized 4
>   BUG pid 9 cookie 1001000000004 sized 4
>   Segmentation fault (core dumped)
> 
> Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

Peter,
I think this should go through your tree again?
bpf-next/master does not have sample_flags merged yet

thanks,
jirka

> ---
>  kernel/trace/bpf_trace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 49fb9ec8366d..1ed08967fb97 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>  
>  	perf_sample_data_init(sd, 0, 0);
>  	sd->raw = &raw;
> +	sd->sample_flags |= PERF_SAMPLE_RAW;
>  
>  	err = __bpf_perf_event_output(regs, map, flags, sd);
>  
> @@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
>  	perf_fetch_caller_regs(regs);
>  	perf_sample_data_init(sd, 0, 0);
>  	sd->raw = &raw;
> +	sd->sample_flags |= PERF_SAMPLE_RAW;
>  
>  	ret = __bpf_perf_event_output(regs, map, flags, sd);
>  out:
> -- 
> 2.36.1
>
Peter Zijlstra Oct. 7, 2022, 3:18 p.m. UTC | #2
On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > * Raw data is also filled by bpf_perf_event_output.
> > * Add sample_flags to indicate raw data.
> > * This eliminates the segfaults as shown below:
> >   Run ./samples/bpf/trace_output
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   Segmentation fault (core dumped)
> > 
> > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> 
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> Peter,
> I think this should go through your tree again?
> bpf-next/master does not have sample_flags merged yet

Yep can do. I'll line it up in perf/urgent (Ingo just send out
perf/core).
SeongJae Park Oct. 17, 2022, 7:27 p.m. UTC | #3
Hello,


The commit that this patch is fixing[1] also causes yet another segfault for
'perf-script' of tracepoint records.  For example:

    $ sudo timeout 3 perf record -e exceptions:page_fault_user
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
    $ sudo perf script
    Segmentation fault

Reverting this patch and the original bug commit[1] fixes the issue.  I haven't
deep dive yet because I'm not familiar with this area.  Anybody has any idea
about this?

[1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")


Thanks,
SJ

On Fri, 7 Oct 2022 10:13:27 +0200 Sumanth Korikkar <sumanthk@linux.ibm.com> wrote:

> * Raw data is also filled by bpf_perf_event_output.
> * Add sample_flags to indicate raw data.
> * This eliminates the segfaults as shown below:
>   Run ./samples/bpf/trace_output
>   BUG pid 9 cookie 1001000000004 sized 4
>   BUG pid 9 cookie 1001000000004 sized 4
>   BUG pid 9 cookie 1001000000004 sized 4
>   Segmentation fault (core dumped)
> 
> Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> ---
>  kernel/trace/bpf_trace.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 49fb9ec8366d..1ed08967fb97 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
>  
>  	perf_sample_data_init(sd, 0, 0);
>  	sd->raw = &raw;
> +	sd->sample_flags |= PERF_SAMPLE_RAW;
>  
>  	err = __bpf_perf_event_output(regs, map, flags, sd);
>  
> @@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
>  	perf_fetch_caller_regs(regs);
>  	perf_sample_data_init(sd, 0, 0);
>  	sd->raw = &raw;
> +	sd->sample_flags |= PERF_SAMPLE_RAW;
>  
>  	ret = __bpf_perf_event_output(regs, map, flags, sd);
>  out:
> -- 
> 2.36.1
Namhyung Kim Oct. 17, 2022, 10:52 p.m. UTC | #4
Hi SeongJae,

On Mon, Oct 17, 2022 at 12:27 PM SeongJae Park <sj@kernel.org> wrote:
>
> Hello,
>
>
> The commit that this patch is fixing[1] also causes yet another segfault for
> 'perf-script' of tracepoint records.  For example:
>
>     $ sudo timeout 3 perf record -e exceptions:page_fault_user
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
>     $ sudo perf script
>     Segmentation fault
>
> Reverting this patch and the original bug commit[1] fixes the issue.  I haven't
> deep dive yet because I'm not familiar with this area.  Anybody has any idea
> about this?
>
> [1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")

Sorry for the trouble.  I think you also need to apply the below:

https://lore.kernel.org/r/20221012143857.48198-1-james.clark@arm.com

Thanks,
Namhyung

>
> On Fri, 7 Oct 2022 10:13:27 +0200 Sumanth Korikkar <sumanthk@linux.ibm.com> wrote:
>
> > * Raw data is also filled by bpf_perf_event_output.
> > * Add sample_flags to indicate raw data.
> > * This eliminates the segfaults as shown below:
> >   Run ./samples/bpf/trace_output
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   BUG pid 9 cookie 1001000000004 sized 4
> >   Segmentation fault (core dumped)
> >
> > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> > ---
> >  kernel/trace/bpf_trace.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 49fb9ec8366d..1ed08967fb97 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
> >
> >       perf_sample_data_init(sd, 0, 0);
> >       sd->raw = &raw;
> > +     sd->sample_flags |= PERF_SAMPLE_RAW;
> >
> >       err = __bpf_perf_event_output(regs, map, flags, sd);
> >
> > @@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
> >       perf_fetch_caller_regs(regs);
> >       perf_sample_data_init(sd, 0, 0);
> >       sd->raw = &raw;
> > +     sd->sample_flags |= PERF_SAMPLE_RAW;
> >
> >       ret = __bpf_perf_event_output(regs, map, flags, sd);
> >  out:
> > --
> > 2.36.1
SeongJae Park Oct. 17, 2022, 11:35 p.m. UTC | #5
On Mon, 17 Oct 2022 15:52:15 -0700 Namhyung Kim <namhyung@kernel.org> wrote:

> Hi SeongJae,
> 
> On Mon, Oct 17, 2022 at 12:27 PM SeongJae Park <sj@kernel.org> wrote:
> >
> > Hello,
> >
> >
> > The commit that this patch is fixing[1] also causes yet another segfault for
> > 'perf-script' of tracepoint records.  For example:
> >
> >     $ sudo timeout 3 perf record -e exceptions:page_fault_user
> >     [ perf record: Woken up 1 times to write data ]
> >     [ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
> >     $ sudo perf script
> >     Segmentation fault
> >
> > Reverting this patch and the original bug commit[1] fixes the issue.  I haven't
> > deep dive yet because I'm not familiar with this area.  Anybody has any idea
> > about this?
> >
> > [1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> 
> Sorry for the trouble.

No problem.

> I think you also need to apply the below:
> 
> https://lore.kernel.org/r/20221012143857.48198-1-james.clark@arm.com

Thank you for this nice answer.  I confirmed that this fixes my issue.


Thanks,
SJ

[...]
Alexei Starovoitov Oct. 19, 2022, 4:57 a.m. UTC | #6
On Fri, Oct 7, 2022 at 8:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> > On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > > * Raw data is also filled by bpf_perf_event_output.
> > > * Add sample_flags to indicate raw data.
> > > * This eliminates the segfaults as shown below:
> > >   Run ./samples/bpf/trace_output
> > >   BUG pid 9 cookie 1001000000004 sized 4
> > >   BUG pid 9 cookie 1001000000004 sized 4
> > >   BUG pid 9 cookie 1001000000004 sized 4
> > >   Segmentation fault (core dumped)
> > >
> > > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > > Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> >
> > Acked-by: Jiri Olsa <jolsa@kernel.org>
> >
> > Peter,
> > I think this should go through your tree again?
> > bpf-next/master does not have sample_flags merged yet
>
> Yep can do. I'll line it up in perf/urgent (Ingo just send out
> perf/core).

Peter,

Could you please hurry up. 11 days have passed.

This issue affects everyone the hard way now after merging
all the trees: tip -> linus -> net-next -> bpf-next.
The BPF CI is red right now with 5 tests failing because
this fix is still missing.
It's causing a headache to maintainers and developers.
Alexei Starovoitov Oct. 21, 2022, 1:36 a.m. UTC | #7
Peter,

Another 2 days have passed and bpf side is still broken
due to the change that went during the merge window without
corresponding fix from the bpf side.
Looks like the patch is sitting in tip:perf/urgent.
Please send it to Linus asap.

We're not sending bpf fixes to avoid breaking bpf tree too.
We've worked around the issue in bpf CI for bpf-next tree only.
Developers still see failures when they run tests locally.

On Tue, Oct 18, 2022 at 9:57 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Oct 7, 2022 at 8:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> > > On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > > > * Raw data is also filled by bpf_perf_event_output.
> > > > * Add sample_flags to indicate raw data.
> > > > * This eliminates the segfaults as shown below:
> > > >   Run ./samples/bpf/trace_output
> > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > >   Segmentation fault (core dumped)
> > > >
> > > > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > > > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > > > Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> > >
> > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> > >
> > > Peter,
> > > I think this should go through your tree again?
> > > bpf-next/master does not have sample_flags merged yet
> >
> > Yep can do. I'll line it up in perf/urgent (Ingo just send out
> > perf/core).
>
> Peter,
>
> Could you please hurry up. 11 days have passed.
>
> This issue affects everyone the hard way now after merging
> all the trees: tip -> linus -> net-next -> bpf-next.
> The BPF CI is red right now with 5 tests failing because
> this fix is still missing.
> It's causing a headache to maintainers and developers.
Alexei Starovoitov Oct. 23, 2022, 1:16 a.m. UTC | #8
Another 2 days have passed and the fix is still not in the Linus's tree.

Peter,
whatever your excuse is for not sending tip:perf/urgent
this is not acceptable.

Linus,

please apply this fix directly:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=perf/urgent&id=21da7472a040420f2dc624ffec70291a72c5d6a6

or suggest the course of action.

It sucked to have such a breakage in rc1 and we don't want rc2
to stay broken.

Thanks

On Thu, Oct 20, 2022 at 6:36 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> Peter,
>
> Another 2 days have passed and bpf side is still broken
> due to the change that went during the merge window without
> corresponding fix from the bpf side.
> Looks like the patch is sitting in tip:perf/urgent.
> Please send it to Linus asap.
>
> We're not sending bpf fixes to avoid breaking bpf tree too.
> We've worked around the issue in bpf CI for bpf-next tree only.
> Developers still see failures when they run tests locally.
>
> On Tue, Oct 18, 2022 at 9:57 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Oct 7, 2022 at 8:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> > > > On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > > > > * Raw data is also filled by bpf_perf_event_output.
> > > > > * Add sample_flags to indicate raw data.
> > > > > * This eliminates the segfaults as shown below:
> > > > >   Run ./samples/bpf/trace_output
> > > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > > >   BUG pid 9 cookie 1001000000004 sized 4
> > > > >   Segmentation fault (core dumped)
> > > > >
> > > > > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > > > > Acked-by: Namhyung Kim <namhyung@kernel.org>
> > > > > Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
> > > >
> > > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> > > >
> > > > Peter,
> > > > I think this should go through your tree again?
> > > > bpf-next/master does not have sample_flags merged yet
> > >
> > > Yep can do. I'll line it up in perf/urgent (Ingo just send out
> > > perf/core).
> >
> > Peter,
> >
> > Could you please hurry up. 11 days have passed.
> >
> > This issue affects everyone the hard way now after merging
> > all the trees: tip -> linus -> net-next -> bpf-next.
> > The BPF CI is red right now with 5 tests failing because
> > this fix is still missing.
> > It's causing a headache to maintainers and developers.
Linus Torvalds Oct. 23, 2022, 4:55 p.m. UTC | #9
On Sat, Oct 22, 2022 at 6:16 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> Linus,
>
> please apply this fix directly or suggest the course of action.

I have a pull request from Borislav with the fix that came in
overnight, so this should be all fixed in rc2.

                 Linus
Linus Torvalds Oct. 23, 2022, 5:19 p.m. UTC | #10
On Sun, Oct 23, 2022 at 9:55 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I have a pull request from Borislav with the fix that came in
> overnight, so this should be all fixed in rc2.

.. and now it has moved from my inbox to my -git tree.

                   Linus
Alexei Starovoitov Oct. 23, 2022, 5:28 p.m. UTC | #11
On Sun, Oct 23, 2022 at 10:20 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Oct 23, 2022 at 9:55 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I have a pull request from Borislav with the fix that came in
> > overnight, so this should be all fixed in rc2.
>
> .. and now it has moved from my inbox to my -git tree.

Great. Thank you.
diff mbox series

Patch

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 49fb9ec8366d..1ed08967fb97 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -687,6 +687,7 @@  BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
 
 	perf_sample_data_init(sd, 0, 0);
 	sd->raw = &raw;
+	sd->sample_flags |= PERF_SAMPLE_RAW;
 
 	err = __bpf_perf_event_output(regs, map, flags, sd);
 
@@ -745,6 +746,7 @@  u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
 	perf_fetch_caller_regs(regs);
 	perf_sample_data_init(sd, 0, 0);
 	sd->raw = &raw;
+	sd->sample_flags |= PERF_SAMPLE_RAW;
 
 	ret = __bpf_perf_event_output(regs, map, flags, sd);
 out: