diff mbox series

coresight: Fix crash when Perf and sysfs modes are used concurrently

Message ID 20231006131452.646721-1-james.clark@arm.com (mailing list archive)
State New, archived
Headers show
Series coresight: Fix crash when Perf and sysfs modes are used concurrently | expand

Commit Message

James Clark Oct. 6, 2023, 1:14 p.m. UTC
Partially revert the change in commit 6148652807ba ("coresight: Enable
and disable helper devices adjacent to the path") which changed the bare
call from source_ops(csdev)->enable() to coresight_enable_source() for
Perf sessions. It was missed that coresight_enable_source() is
specifically for the sysfs interface, rather than being a generic call.
This interferes with the sysfs reference counting to cause the following
crash:

  $ perf record -e cs_etm/@tmc_etr0/ -C 0 &
  $ echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
  $ echo 1 > /sys/bus/coresight/devices/etm0/enable_source
  $ echo 0 > /sys/bus/coresight/devices/etm0/enable_source

  Unable to handle kernel NULL pointer dereference at virtual
  address 00000000000001d0
  Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
  ...
  Call trace:
   etm4_disable+0x54/0x150 [coresight_etm4x]
   coresight_disable_source+0x6c/0x98 [coresight]
   coresight_disable+0x74/0x1c0 [coresight]
   enable_source_store+0x88/0xa0 [coresight]
   dev_attr_store+0x20/0x40
   sysfs_kf_write+0x4c/0x68
   kernfs_fop_write_iter+0x120/0x1b8
   vfs_write+0x2dc/0x3b0
   ksys_write+0x70/0x108
   __arm64_sys_write+0x24/0x38
   invoke_syscall+0x50/0x128
   el0_svc_common.constprop.0+0x104/0x130
   do_el0_svc+0x40/0xb8
   el0_svc+0x2c/0xb8
   el0t_64_sync_handler+0xc0/0xc8
   el0t_64_sync+0x1a4/0x1a8
  Code: d53cd042 91002000 b9402a81 b8626800 (f940ead5)
  ---[ end trace 0000000000000000 ]---

This commit linked below also fixes the issue, but has unlocked updates
to the mode which could potentially race. So until we come up with a
more complete solution that takes all locking and interaction between
both modes into account, just revert back to the old behavior for Perf.

Reported-by: Junhao He <hejunhao3@huawei.com>
Closes: https://lore.kernel.org/linux-arm-kernel/20230921132904.60996-1-hejunhao3@huawei.com/
Fixes: 6148652807ba ("coresight: Enable and disable helper devices adjacent to the path")
Signed-off-by: James Clark <james.clark@arm.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Suzuki K Poulose Oct. 9, 2023, 12:27 p.m. UTC | #1
Junhao He,

Please could you test the patch and let us know if it resolves the
problem for you ?

On 06/10/2023 14:14, James Clark wrote:
> Partially revert the change in commit 6148652807ba ("coresight: Enable
> and disable helper devices adjacent to the path") which changed the bare
> call from source_ops(csdev)->enable() to coresight_enable_source() for
> Perf sessions. It was missed that coresight_enable_source() is
> specifically for the sysfs interface, rather than being a generic call.
> This interferes with the sysfs reference counting to cause the following
> crash:
> 
>    $ perf record -e cs_etm/@tmc_etr0/ -C 0 &
>    $ echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>    $ echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>    $ echo 0 > /sys/bus/coresight/devices/etm0/enable_source
> 
>    Unable to handle kernel NULL pointer dereference at virtual
>    address 00000000000001d0
>    Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>    ...
>    Call trace:
>     etm4_disable+0x54/0x150 [coresight_etm4x]
>     coresight_disable_source+0x6c/0x98 [coresight]
>     coresight_disable+0x74/0x1c0 [coresight]
>     enable_source_store+0x88/0xa0 [coresight]
>     dev_attr_store+0x20/0x40
>     sysfs_kf_write+0x4c/0x68
>     kernfs_fop_write_iter+0x120/0x1b8
>     vfs_write+0x2dc/0x3b0
>     ksys_write+0x70/0x108
>     __arm64_sys_write+0x24/0x38
>     invoke_syscall+0x50/0x128
>     el0_svc_common.constprop.0+0x104/0x130
>     do_el0_svc+0x40/0xb8
>     el0_svc+0x2c/0xb8
>     el0t_64_sync_handler+0xc0/0xc8
>     el0t_64_sync+0x1a4/0x1a8
>    Code: d53cd042 91002000 b9402a81 b8626800 (f940ead5)
>    ---[ end trace 0000000000000000 ]---
> 
> This commit linked below also fixes the issue, but has unlocked updates
> to the mode which could potentially race. So until we come up with a
> more complete solution that takes all locking and interaction between
> both modes into account, just revert back to the old behavior for Perf.
> 
> Reported-by: Junhao He <hejunhao3@huawei.com>
> Closes: https://lore.kernel.org/linux-arm-kernel/20230921132904.60996-1-hejunhao3@huawei.com/
> Fixes: 6148652807ba ("coresight: Enable and disable helper devices adjacent to the path")
> Signed-off-by: James Clark <james.clark@arm.com>

The patch looks good to me. I will wait for Junhao to test this before
pulling it in.

Suzuki




> ---
>   drivers/hwtracing/coresight/coresight-etm-perf.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 5ca6278baff4..89e8ed214ea4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -493,7 +493,7 @@ static void etm_event_start(struct perf_event *event, int flags)
>   		goto fail_end_stop;
>   
>   	/* Finally enable the tracer */
> -	if (coresight_enable_source(csdev, CS_MODE_PERF, event))
> +	if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
>   		goto fail_disable_path;
>   
>   	/*
> @@ -587,7 +587,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
>   		return;
>   
>   	/* stop tracer */
> -	coresight_disable_source(csdev, event);
> +	source_ops(csdev)->disable(csdev, event);
>   
>   	/* tell the core */
>   	event->hw.state = PERF_HES_STOPPED;
Junhao He Oct. 10, 2023, 2:43 a.m. UTC | #2
On 2023/10/9 20:27, Suzuki K Poulose wrote:
> Junhao He,
>
> Please could you test the patch and let us know if it resolves the
> problem for you ?

Hi James and Suzuki ,

I tested the patch on my platform and it worked well.

>
> On 06/10/2023 14:14, James Clark wrote:
>> Partially revert the change in commit 6148652807ba ("coresight: Enable
>> and disable helper devices adjacent to the path") which changed the bare
>> call from source_ops(csdev)->enable() to coresight_enable_source() for
>> Perf sessions. It was missed that coresight_enable_source() is
>> specifically for the sysfs interface, rather than being a generic call.
>> This interferes with the sysfs reference counting to cause the following
>> crash:
>>
>>    $ perf record -e cs_etm/@tmc_etr0/ -C 0 &
>>    $ echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
>>    $ echo 1 > /sys/bus/coresight/devices/etm0/enable_source
>>    $ echo 0 > /sys/bus/coresight/devices/etm0/enable_source
>>
>>    Unable to handle kernel NULL pointer dereference at virtual
>>    address 00000000000001d0
>>    Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>>    ...
>>    Call trace:
>>     etm4_disable+0x54/0x150 [coresight_etm4x]
>>     coresight_disable_source+0x6c/0x98 [coresight]
>>     coresight_disable+0x74/0x1c0 [coresight]
>>     enable_source_store+0x88/0xa0 [coresight]
>>     dev_attr_store+0x20/0x40
>>     sysfs_kf_write+0x4c/0x68
>>     kernfs_fop_write_iter+0x120/0x1b8
>>     vfs_write+0x2dc/0x3b0
>>     ksys_write+0x70/0x108
>>     __arm64_sys_write+0x24/0x38
>>     invoke_syscall+0x50/0x128
>>     el0_svc_common.constprop.0+0x104/0x130
>>     do_el0_svc+0x40/0xb8
>>     el0_svc+0x2c/0xb8
>>     el0t_64_sync_handler+0xc0/0xc8
>>     el0t_64_sync+0x1a4/0x1a8
>>    Code: d53cd042 91002000 b9402a81 b8626800 (f940ead5)
>>    ---[ end trace 0000000000000000 ]---
>>
>> This commit linked below also fixes the issue, but has unlocked updates
>> to the mode which could potentially race. So until we come up with a
>> more complete solution that takes all locking and interaction between
>> both modes into account, just revert back to the old behavior for Perf.
>>
>> Reported-by: Junhao He <hejunhao3@huawei.com>
>> Closes: 
>> https://lore.kernel.org/linux-arm-kernel/20230921132904.60996-1-hejunhao3@huawei.com/
>> Fixes: 6148652807ba ("coresight: Enable and disable helper devices 
>> adjacent to the path")
>> Signed-off-by: James Clark <james.clark@arm.com>
>
> The patch looks good to me. I will wait for Junhao to test this before
> pulling it in.
>
> Suzuki
>

Tested-by: Junhao He <hejunhao3@huawei.com>

Thanks.
Junhao.

>
>
>
>> ---
>>   drivers/hwtracing/coresight/coresight-etm-perf.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c 
>> b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index 5ca6278baff4..89e8ed214ea4 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -493,7 +493,7 @@ static void etm_event_start(struct perf_event 
>> *event, int flags)
>>           goto fail_end_stop;
>>         /* Finally enable the tracer */
>> -    if (coresight_enable_source(csdev, CS_MODE_PERF, event))
>> +    if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
>>           goto fail_disable_path;
>>         /*
>> @@ -587,7 +587,7 @@ static void etm_event_stop(struct perf_event 
>> *event, int mode)
>>           return;
>>         /* stop tracer */
>> -    coresight_disable_source(csdev, event);
>> +    source_ops(csdev)->disable(csdev, event);
>>         /* tell the core */
>>       event->hw.state = PERF_HES_STOPPED;
>
> _______________________________________________
> CoreSight mailing list -- coresight@lists.linaro.org
> To unsubscribe send an email to coresight-leave@lists.linaro.org
>
> .
>
Suzuki K Poulose Oct. 16, 2023, 11:05 p.m. UTC | #3
On Fri, 6 Oct 2023 14:14:52 +0100, James Clark wrote:
> Partially revert the change in commit 6148652807ba ("coresight: Enable
> and disable helper devices adjacent to the path") which changed the bare
> call from source_ops(csdev)->enable() to coresight_enable_source() for
> Perf sessions. It was missed that coresight_enable_source() is
> specifically for the sysfs interface, rather than being a generic call.
> This interferes with the sysfs reference counting to cause the following
> crash:
> 
> [...]

Applied, thanks!

[1/1] coresight: Fix crash when Perf and sysfs modes are used concurrently
      https://git.kernel.org/coresight/c/078dbba3f0c9

Best regards,
diff mbox series

Patch

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 5ca6278baff4..89e8ed214ea4 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -493,7 +493,7 @@  static void etm_event_start(struct perf_event *event, int flags)
 		goto fail_end_stop;
 
 	/* Finally enable the tracer */
-	if (coresight_enable_source(csdev, CS_MODE_PERF, event))
+	if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
 		goto fail_disable_path;
 
 	/*
@@ -587,7 +587,7 @@  static void etm_event_stop(struct perf_event *event, int mode)
 		return;
 
 	/* stop tracer */
-	coresight_disable_source(csdev, event);
+	source_ops(csdev)->disable(csdev, event);
 
 	/* tell the core */
 	event->hw.state = PERF_HES_STOPPED;