Message ID | 20230703150859.6176-1-ubizjak@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read | expand |
On Mon, 03 Jul 2023, Uros Bizjak <ubizjak@gmail.com> wrote: > Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old > in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag, > so this change saves a compare after cmpxchg (and related move instruction > in front of cmpxchg). > > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg > fails. There is no need to re-read the value in the loop. > > No functional change intended. > > Cc: Jani Nikula <jani.nikula@linux.intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > Cc: David Airlie <airlied@gmail.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Signed-off-by: Uros Bizjak <ubizjak@gmail.com> > --- > drivers/gpu/drm/i915/i915_pmu.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > index d35973b41186..108b675088ba 100644 > --- a/drivers/gpu/drm/i915/i915_pmu.c > +++ b/drivers/gpu/drm/i915/i915_pmu.c > @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event) > event->hw.state = PERF_HES_STOPPED; > return; > } > -again: > - prev = local64_read(&hwc->prev_count); > - new = __i915_pmu_event_read(event); > > - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) > - goto again; > + prev = local64_read(&hwc->prev_count); > + do { > + new = __i915_pmu_event_read(event); > + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new)); You could save everyone a lot of time by actually documenting what these functions do. Assume you don't know what local64_try_cmpxchg() does, and see how many calls you have to go through to figure it out. Because the next time I encounter this code or a patch like this, I'm probably going to have to do that again. To me, the old one was more readable. The optimization is meaningless to me if it's not quantified but reduces readability. BR, Jani. > > local64_add(new - prev, &event->count); > }
On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <jani.nikula@linux.intel.com> wrote: > > On Mon, 03 Jul 2023, Uros Bizjak <ubizjak@gmail.com> wrote: > > Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old > > in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag, > > so this change saves a compare after cmpxchg (and related move instruction > > in front of cmpxchg). > > > > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg > > fails. There is no need to re-read the value in the loop. > > > > No functional change intended. > > > > Cc: Jani Nikula <jani.nikula@linux.intel.com> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > > Cc: David Airlie <airlied@gmail.com> > > Cc: Daniel Vetter <daniel@ffwll.ch> > > Signed-off-by: Uros Bizjak <ubizjak@gmail.com> > > --- > > drivers/gpu/drm/i915/i915_pmu.c | 9 ++++----- > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > > index d35973b41186..108b675088ba 100644 > > --- a/drivers/gpu/drm/i915/i915_pmu.c > > +++ b/drivers/gpu/drm/i915/i915_pmu.c > > @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event) > > event->hw.state = PERF_HES_STOPPED; > > return; > > } > > -again: > > - prev = local64_read(&hwc->prev_count); > > - new = __i915_pmu_event_read(event); > > > > - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) > > - goto again; > > + prev = local64_read(&hwc->prev_count); > > + do { > > + new = __i915_pmu_event_read(event); > > + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new)); > > You could save everyone a lot of time by actually documenting what these > functions do. Assume you don't know what local64_try_cmpxchg() does, and > see how many calls you have to go through to figure it out. These functions are documented in Documentation/atomic_t.txt (under "RMW ops:" section), and the difference is explained in a separate section "CMPXCHG vs TRY_CMPXCGS" in the same file. Uros. > Because the next time I encounter this code or a patch like this, I'm > probably going to have to do that again. > > To me, the old one was more readable. The optimization is meaningless to > me if it's not quantified but reduces readability. > > > BR, > Jani. > > > > > > local64_add(new - prev, &event->count); > > } > > -- > Jani Nikula, Intel Open Source Graphics Center
On Tue, 04 Jul 2023, Uros Bizjak <ubizjak@gmail.com> wrote: > On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <jani.nikula@linux.intel.com> wrote: >> You could save everyone a lot of time by actually documenting what these >> functions do. Assume you don't know what local64_try_cmpxchg() does, and >> see how many calls you have to go through to figure it out. > > These functions are documented in Documentation/atomic_t.txt (under > "RMW ops:" section), and the difference is explained in a separate > section "CMPXCHG vs TRY_CMPXCGS" in the same file. Thanks, but *sigh*. No kernel-doc above the functions, not even a regular comment referencing atomic_t.txt. $ git grep local.*_try -- Documentation [nothing] BR, Jani.
On Tue, Jul 4, 2023 at 10:37 AM Jani Nikula <jani.nikula@linux.intel.com> wrote: > > On Tue, 04 Jul 2023, Uros Bizjak <ubizjak@gmail.com> wrote: > > On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <jani.nikula@linux.intel.com> wrote: > >> You could save everyone a lot of time by actually documenting what these > >> functions do. Assume you don't know what local64_try_cmpxchg() does, and > >> see how many calls you have to go through to figure it out. > > > > These functions are documented in Documentation/atomic_t.txt (under > > "RMW ops:" section), and the difference is explained in a separate > > section "CMPXCHG vs TRY_CMPXCGS" in the same file. > > Thanks, but *sigh*. > > No kernel-doc above the functions, not even a regular comment > referencing atomic_t.txt. > > $ git grep local.*_try -- Documentation > [nothing] Unfortunately, this was always the state w.r.t. local.* atomic functions. There is an effort to improve the documentation of atomics, perhaps it will be also extended to local variants. Uros.
On Mon, 03 Jul 2023, Uros Bizjak <ubizjak@gmail.com> wrote: > Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old > in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag, > so this change saves a compare after cmpxchg (and related move instruction > in front of cmpxchg). > > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg > fails. There is no need to re-read the value in the loop. > > No functional change intended. > > Cc: Jani Nikula <jani.nikula@linux.intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> > Cc: David Airlie <airlied@gmail.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Signed-off-by: Uros Bizjak <ubizjak@gmail.com> > --- > drivers/gpu/drm/i915/i915_pmu.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > index d35973b41186..108b675088ba 100644 > --- a/drivers/gpu/drm/i915/i915_pmu.c > +++ b/drivers/gpu/drm/i915/i915_pmu.c > @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event) > event->hw.state = PERF_HES_STOPPED; > return; > } > -again: > - prev = local64_read(&hwc->prev_count); > - new = __i915_pmu_event_read(event); > > - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) > - goto again; > + prev = local64_read(&hwc->prev_count); > + do { > + new = __i915_pmu_event_read(event); > + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new)); Chased through the documentation again, and pushed to drm-intel-next. Thanks for the patch. BR, Jani. > > local64_add(new - prev, &event->count); > }
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index d35973b41186..108b675088ba 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event) event->hw.state = PERF_HES_STOPPED; return; } -again: - prev = local64_read(&hwc->prev_count); - new = __i915_pmu_event_read(event); - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) - goto again; + prev = local64_read(&hwc->prev_count); + do { + new = __i915_pmu_event_read(event); + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new)); local64_add(new - prev, &event->count); }
Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> --- drivers/gpu/drm/i915/i915_pmu.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)