Message ID | 20241008183501.1354695-1-lucas.demarchi@intel.com (mailing list archive) |
---|---|
Headers | show |
Series | perf: Fix pmu for drivers with bind/unbind | expand |
On Tue, Oct 08, 2024 at 01:34:56PM -0500, Lucas De Marchi wrote: >v2 of my attempt at fixing how i915 interacts with perf events. > >v1 - https://lore.kernel.org/all/20240722210648.80892-1-lucas.demarchi@intel.com/ > >From other people: >1) https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursulin@linux.intel.com/ >2) https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.ramappa@intel.com/ > >WARNING: patches 1, 4 and 5 are NOT intended to be applied as is. More >on this below. > >This series basically builds on the idea of the first patch of my >previous series, but extends it in a way that > > a) the other patches are not needed (at least, not as is) and > b) driver can rebind just fine - no issues with the new call to > perf_pmu_register() I have 2 broad questions: (1) I am curious how (b) works. You seem to have a notion of instances of devices and then are you using the instance number to create the name used for the sysfs entry? Did I get that right? If so, should the application discover what the name is each time it is run? In the failure case that I am seeing, I am running an application that does not work when I rename the sysfs entry to something else. (2) Similar to Patch 1 of your v1 series where you modified _free_event: static void _free_event(struct perf_event *event) { struct module *module; ... module = event->pmu->module; ... if (event->destroy) event->destroy(event); ... module_put(module); ... } With the above code, the kref to i915->pmu can be taken from the below points in i915 code (just to point out the sequence): i915_pmu_register() { perf_pmu_register() drm_dev_get() kref_init() } i915_pmu_unregister() { kref_put(&ref, pmu_cleanup) } i915_pmu_event_init() { kref_get() } i915_pmu_event_destroy() { kref_put(&ref, pmu_cleanup) } void pmu_cleanup(struct kref *ref) { i915_pmu_unregister_cpuhp_state(pmu); perf_pmu_unregister(&pmu->base); pmu->base.event_init = NULL; kfree(pmu->base.attr_groups); if (!is_igp(i915)) kfree(pmu->name); free_event_attributes(pmu); drm_dev_put(&i915->drm); } Would this work? Do you see any gaps that may need the ref counting code you added in perf? Thanks, Umesh > >Another difference is that rather than mixing i915 cleanups this just >adds a dummy pmu with no backing HW. Intention for dummy_pmu is for >experimental purpose and to demonstrate changes tha need to be applied >to i915 (and probably amdgpu, and also in the pending xe patch). >If desired to have an example like that in tree, then we should hide it >behind a config option and I'd need to re-check the error handling. > >With this set I could run the following test script multiple times with >no issues observed: > > #!/bin/bash > > set -e > > rand_sleep() { > sleep $(bc -l <<< "$(shuf -i 0-3000 -n 1) / 1000") > } > > test_noperf() { > echo 0 > /sys/kernel/debug/dummy_pmu/bind > > echo 0 > /sys/kernel/debug/dummy_pmu/unbind > } > > test_perf_before() { > echo 0 > /sys/kernel/debug/dummy_pmu/bind > > perf stat --interval-count 2 -e dummy_pmu_0/test-event-1/ -I500 > echo 0 > /sys/kernel/debug/dummy_pmu/unbind > } > > test_kill_perf_later() { > echo 0 > /sys/kernel/debug/dummy_pmu/bind > > perf stat -e dummy_pmu_0/test-event-1/ -I500 & > pid=$! > rand_sleep > echo 0 > /sys/kernel/debug/dummy_pmu/unbind > rand_sleep > kill $pid > } > > test_kill_perf_laaaaaaater() { > echo 0 > /sys/kernel/debug/dummy_pmu/bind > > perf stat -e dummy_pmu_0/test-event-1/ -I500 & > pid=$! > rand_sleep > echo 0 > /sys/kernel/debug/dummy_pmu/unbind > rand_sleep > rand_sleep > rand_sleep > kill $pid > } > > test_kill_perf_with_leader() { > echo 0 > /sys/kernel/debug/dummy_pmu/bind > > perf stat -e '{dummy_pmu_0/test-event-1/,dummy_pmu_0/test-event-2/}:S' -I500 & > pid=$! > rand_sleep > echo 0 > /sys/kernel/debug/dummy_pmu/unbind > rand_sleep > rand_sleep > kill $pid > } > > N=${1:-1} > > for ((i=0; i<N; i++)); do > printf "%04u/%04u\n" "$((i+1))" "$N" >&2 > test_noperf > test_perf_before > test_kill_perf_later > test_kill_perf_laaaaaaater > test_kill_perf_with_leader > echo >&2 > done > >Last patch is optional for a driver and not needed for the fix. > >Open topics: > > - If something like the last patch is desirable, should it be > done from inside perf_pmu_unregister()? > > - Should we have a dummy_pmu (or whatever name) behind a config, > or maybe in Documentation/ ? > >thanks, >Lucas De Marchi > >Lucas De Marchi (5): > perf: Add dummy pmu module > perf: Move free outside of the mutex > perf: Add pmu get/put > perf/dummy_pmu: Tie pmu to device lifecycle > perf/dummy_pmu: Track and disable active events > > include/linux/perf_event.h | 12 + > kernel/events/Makefile | 1 + > kernel/events/core.c | 39 ++- > kernel/events/dummy_pmu.c | 492 +++++++++++++++++++++++++++++++++++++ > 4 files changed, 539 insertions(+), 5 deletions(-) > create mode 100644 kernel/events/dummy_pmu.c > >-- >2.46.2 >
On Tue, Oct 08, 2024 at 01:34:56PM -0500, Lucas De Marchi wrote: >v2 of my attempt at fixing how i915 interacts with perf events. > >v1 - https://lore.kernel.org/all/20240722210648.80892-1-lucas.demarchi@intel.com/ > >From other people: >1) https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursulin@linux.intel.com/ >2) https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.ramappa@intel.com/ > >WARNING: patches 1, 4 and 5 are NOT intended to be applied as is. More >on this below. I also took the patches 2 and 3, that are the ones needed, and applied the i915 changes on top. I sent only to i915 mailing list since I didn't want to pollute the mailing list with resubmissions of the same patches over and over. These fixes also worked for i915. See https://lore.kernel.org/intel-gfx/20241011225430.1219345-1-lucas.demarchi@intel.com/ if interested. Thanks Lucas De Marchi
On Fri, Oct 11, 2024 at 03:21:18PM -0700, Umesh Nerlige Ramappa wrote: >On Tue, Oct 08, 2024 at 01:34:56PM -0500, Lucas De Marchi wrote: >>v2 of my attempt at fixing how i915 interacts with perf events. >> >>v1 - https://lore.kernel.org/all/20240722210648.80892-1-lucas.demarchi@intel.com/ >> >>From other people: >>1) https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursulin@linux.intel.com/ >>2) https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.ramappa@intel.com/ >> >>WARNING: patches 1, 4 and 5 are NOT intended to be applied as is. More >>on this below. >> >>This series basically builds on the idea of the first patch of my >>previous series, but extends it in a way that >> >> a) the other patches are not needed (at least, not as is) and >> b) driver can rebind just fine - no issues with the new call to >> perf_pmu_register() > >I have 2 broad questions: > >(1) I am curious how (b) works. You seem to have a notion of instances >of devices and then are you using the instance number to create the >name used for the sysfs entry? Did I get that right? humn... no. We just unregister the driver from pmu, so the name becomes free for when the driver rebinds with the same event name. > >If so, should the application discover what the name is each time it >is run? In the failure case that I am seeing, I am running an >application that does not work when I rename the sysfs entry to >something else. > >(2) Similar to Patch 1 of your v1 series where you modified _free_event: > >static void _free_event(struct perf_event *event) >{ > struct module *module; >... > module = event->pmu->module; >... > if (event->destroy) > event->destroy(event); >... > module_put(module); >... >} > >With the above code, the kref to i915->pmu can be taken from the below >points in i915 code (just to point out the sequence): > >i915_pmu_register() >{ > perf_pmu_register() > drm_dev_get() > kref_init() >} > >i915_pmu_unregister() >{ > kref_put(&ref, pmu_cleanup) >} > >i915_pmu_event_init() >{ > kref_get() >} > >i915_pmu_event_destroy() >{ > kref_put(&ref, pmu_cleanup) >} > >void pmu_cleanup(struct kref *ref) >{ > i915_pmu_unregister_cpuhp_state(pmu); > perf_pmu_unregister(&pmu->base); > pmu->base.event_init = NULL; > kfree(pmu->base.attr_groups); > if (!is_igp(i915)) > kfree(pmu->name); > free_event_attributes(pmu); > drm_dev_put(&i915->drm); >} > >Would this work? Do you see any gaps that may need the ref counting >code you added in perf? well... I just posted the fixes for i915 on top of these patches :) You may want to look at https://lore.kernel.org/intel-gfx/20241011225430.1219345-1-lucas.demarchi@intel.com/ It works for me on my machine with a DG2. Lucas De Marchi