[4/8] perf/hw_breakpoint: Make hw_breakpoint_weight() inlinable

Message ID 20220609113046.780504-5-elver@google.com (mailing list archive)
State New, archived
Series perf/hw_breakpoint: Optimize for thousands of tasks

Commit Message

Marco Elver June 9, 2022, 11:30 a.m. UTC
Due to being a __weak function, hw_breakpoint_weight() will cause the
compiler to always emit a call to it. This generates unnecessarily bad
code (register spills etc.) for no good reason; in fact it appears in
profiles of `perf bench -r 100 breakpoint thread -b 4 -p 128 -t 512`:

    ...
    0.70%  [kernel]       [k] hw_breakpoint_weight
    ...

While a small percentage, no architecture defines its own
hw_breakpoint_weight() nor are there users outside hw_breakpoint.c,
which makes the fact it is currently __weak a poor choice.

Change hw_breakpoint_weight()'s definition to follow a similar protocol
to hw_breakpoint_slots(), such that if <asm/hw_breakpoint.h> defines
hw_breakpoint_weight(), we'll use it instead.

The result is that it is inlined and no longer shows up in profiles.

Signed-off-by: Marco Elver <elver@google.com>
---
 include/linux/hw_breakpoint.h | 1 -
 kernel/events/hw_breakpoint.c | 4 +++-
 2 files changed, 3 insertions(+), 2 deletions(-)
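
For illustration only (not part of the posted patch): a minimal userspace sketch of the override protocol described above. It assumes an architecture header would define its own hw_breakpoint_weight() and also define the name as a macro, so that the generic #ifndef fallback is skipped. ARCH_HAS_CUSTOM_WEIGHT, the return value 2, total_weight() and the reduced struct perf_event are hypothetical stand-ins, not kernel code.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct perf_event; only here so the sketch compiles. */
struct perf_event {
	int id;
};

/*
 * Hypothetical arch override, as <asm/hw_breakpoint.h> might provide it.
 * ARCH_HAS_CUSTOM_WEIGHT is an invented switch for this sketch only.
 */
#ifdef ARCH_HAS_CUSTOM_WEIGHT
static inline int hw_breakpoint_weight(struct perf_event *bp)
{
	(void)bp;
	return 2;
}
/* Defining the name as a macro makes the #ifndef below skip the default. */
#define hw_breakpoint_weight hw_breakpoint_weight
#endif

/* Generic fallback, same shape as in the patch: the body is visible, so it can be inlined. */
#ifndef hw_breakpoint_weight
static inline int hw_breakpoint_weight(struct perf_event *bp)
{
	(void)bp;
	return 1;
}
#endif

/* Hypothetical caller: sums the weights of n breakpoints. */
static int total_weight(struct perf_event *bps, size_t n)
{
	int sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += hw_breakpoint_weight(&bps[i]);
	return sum;
}
```

Unlike a __weak definition, which the linker may replace and which the compiler therefore has to call out of line, the static inline fallback is resolved at compile time, so a caller such as total_weight() can reduce to simple arithmetic.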

Comments

Dmitry Vyukov June 9, 2022, 12:03 p.m. UTC | #1
On Thu, 9 Jun 2022 at 13:31, Marco Elver <elver@google.com> wrote:
>
> Due to being a __weak function, hw_breakpoint_weight() will cause the
> compiler to always emit a call to it. This generates unnecessarily bad
> code (register spills etc.) for no good reason; in fact it appears in
> profiles of `perf bench -r 100 breakpoint thread -b 4 -p 128 -t 512`:
>
>     ...
>     0.70%  [kernel]       [k] hw_breakpoint_weight
>     ...
>
> While a small percentage, no architecture defines its own
> hw_breakpoint_weight() nor are there users outside hw_breakpoint.c,
> which makes the fact it is currently __weak a poor choice.
>
> Change hw_breakpoint_weight()'s definition to follow a similar protocol
> to hw_breakpoint_slots(), such that if <asm/hw_breakpoint.h> defines
> hw_breakpoint_weight(), we'll use it instead.
>
> The result is that it is inlined and no longer shows up in profiles.
>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  include/linux/hw_breakpoint.h | 1 -
>  kernel/events/hw_breakpoint.c | 4 +++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
> index 78dd7035d1e5..9fa3547acd87 100644
> --- a/include/linux/hw_breakpoint.h
> +++ b/include/linux/hw_breakpoint.h
> @@ -79,7 +79,6 @@ extern int dbg_reserve_bp_slot(struct perf_event *bp);
>  extern int dbg_release_bp_slot(struct perf_event *bp);
>  extern int reserve_bp_slot(struct perf_event *bp);
>  extern void release_bp_slot(struct perf_event *bp);
> -int hw_breakpoint_weight(struct perf_event *bp);
>  int arch_reserve_bp_slot(struct perf_event *bp);
>  void arch_release_bp_slot(struct perf_event *bp);
>  void arch_unregister_hw_breakpoint(struct perf_event *bp);
> diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
> index 8e939723f27d..5f40c8dfa042 100644
> --- a/kernel/events/hw_breakpoint.c
> +++ b/kernel/events/hw_breakpoint.c
> @@ -125,10 +125,12 @@ static __init int init_breakpoint_slots(void)
>  }
>  #endif
>
> -__weak int hw_breakpoint_weight(struct perf_event *bp)

Humm... this was added in 2010 and never actually used to return
anything other than 1 since then (?). Looks like over-design. Maybe we
drop "#ifndef" and add a comment instead?

> +#ifndef hw_breakpoint_weight
> +static inline int hw_breakpoint_weight(struct perf_event *bp)
>  {
>         return 1;
>  }
> +#endif
>
>  static inline enum bp_type_idx find_slot_idx(u64 bp_type)
>  {
> --
> 2.36.1.255.ge46751e96f-goog
>

Marco Elver June 9, 2022, 12:08 p.m. UTC | #2
On Thu, 9 Jun 2022 at 14:03, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Thu, 9 Jun 2022 at 13:31, Marco Elver <elver@google.com> wrote:
> > [...]
> > -__weak int hw_breakpoint_weight(struct perf_event *bp)
>
> Humm... this was added in 2010 and never actually used to return
> anything other than 1 since then (?). Looks like over-design. Maybe we
> drop "#ifndef" and add a comment instead?

Then there's little reason for the function either and we can just
directly increment/decrement 1 everywhere. If we drop the ability for
an arch to override, I feel that'd be cleaner.

Either way, codegen won't change though.

Preferences?

Dmitry Vyukov June 9, 2022, 12:23 p.m. UTC | #3
On Thu, 9 Jun 2022 at 14:08, Marco Elver <elver@google.com> wrote:
> > > [...]
> > > -__weak int hw_breakpoint_weight(struct perf_event *bp)
> >
> > Humm... this was added in 2010 and never actually used to return
> > anything other than 1 since then (?). Looks like over-design. Maybe we
> > drop "#ifndef" and add a comment instead?
>
> Then there's little reason for the function either and we can just
> directly increment/decrement 1 everywhere. If we drop the ability for
> an arch to override, I feel that'd be cleaner.
>
> Either way, codegen won't change though.
>
> Preferences?

I don't have strong preferences either way.
Can also be:
#define HW_BREAKPOINT_WEIGHT 1

Peter Zijlstra June 9, 2022, 1:25 p.m. UTC | #4
On Thu, Jun 09, 2022 at 02:03:12PM +0200, Dmitry Vyukov wrote:

> > -__weak int hw_breakpoint_weight(struct perf_event *bp)
> 
> Humm... this was added in 2010 and never actually used to return
> anything other than 1 since then (?). Looks like over-design. Maybe we
> drop "#ifndef" and add a comment instead?

Frederic, you have any recollection what this was supposed to do?
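
For illustration only: a sketch of the constant-based alternative raised in comment #3, where the weight is a plain macro rather than a function. Only the name HW_BREAKPOINT_WEIGHT comes from the thread. struct slot_counter, slot_reserve() and slot_release() are hypothetical and far simpler than the real accounting in kernel/events/hw_breakpoint.c, and the -1 error return is a placeholder.

```c
/* Alternative floated in the thread: a plain constant instead of a function. */
#define HW_BREAKPOINT_WEIGHT 1

/* Hypothetical, heavily simplified slot accounting for this sketch only. */
struct slot_counter {
	int used;	/* slots currently reserved */
	int max;	/* slots available in total */
};

/* Reserve one breakpoint's worth of slots; returns 0 on success, -1 if full. */
static int slot_reserve(struct slot_counter *c)
{
	if (c->used + HW_BREAKPOINT_WEIGHT > c->max)
		return -1;
	c->used += HW_BREAKPOINT_WEIGHT;
	return 0;
}

/* Give back what slot_reserve() took. */
static void slot_release(struct slot_counter *c)
{
	c->used -= HW_BREAKPOINT_WEIGHT;
}
```

With the weight fixed at 1, this is equivalent to incrementing and decrementing by 1 directly, which is the other option mentioned in comment #2. The macro only keeps a named place to document what the constant means.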

Patch

diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index 78dd7035d1e5..9fa3547acd87 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -79,7 +79,6 @@  extern int dbg_reserve_bp_slot(struct perf_event *bp);
 extern int dbg_release_bp_slot(struct perf_event *bp);
 extern int reserve_bp_slot(struct perf_event *bp);
 extern void release_bp_slot(struct perf_event *bp);
-int hw_breakpoint_weight(struct perf_event *bp);
 int arch_reserve_bp_slot(struct perf_event *bp);
 void arch_release_bp_slot(struct perf_event *bp);
 void arch_unregister_hw_breakpoint(struct perf_event *bp);
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 8e939723f27d..5f40c8dfa042 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -125,10 +125,12 @@  static __init int init_breakpoint_slots(void)
 }
 #endif
 
-__weak int hw_breakpoint_weight(struct perf_event *bp)
+#ifndef hw_breakpoint_weight
+static inline int hw_breakpoint_weight(struct perf_event *bp)
 {
 	return 1;
 }
+#endif
 
 static inline enum bp_type_idx find_slot_idx(u64 bp_type)
 {