[0/1] arm64: defconfig: Add Coresight as module

Message ID 20220921140535.152627-1-james.clark@arm.com

Message

James Clark Sept. 21, 2022, 2:05 p.m. UTC
As suggested by Catalin here's the change to add Coresight to defconfig.

Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X,
which builds a few files, until [1] is merged, because of the overhead
of CONFIG_PID_IN_CONTEXTIDR.

[1]: https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/T/

James Clark (1):
  arm64: defconfig: Add Coresight as module

 arch/arm64/configs/defconfig | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Catalin Marinas Sept. 21, 2022, 3:08 p.m. UTC | #1
On Wed, Sep 21, 2022 at 03:05:34PM +0100, James Clark wrote:
> As suggested by Catalin here's the change to add Coresight to defconfig.
> 
> Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X
> which builds a few files until [1] is merged because of the overhead
> of CONFIG_PID_IN_CONTEXTIDR.
> 
> [1]: https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/T/

I thought the overhead wasn't the problem, it's mostly negligible. We
can probably save a few more cycles on the __switch_to() path by
replacing several isb()s in those functions with a single one just
before cpu_switch_to().
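
As a rough illustration of the ISB grouping idea, here is a minimal
userspace model. It assumes three hypothetical per-feature hooks
(hook_a/b/c are made-up names, not kernel functions) and replaces the
real isb() with a counter, so it only shows the shape of the change.
Whether a single barrier is sufficient depends on the ordering each
system register write actually needs, which this sketch does not check.

```c
/* Counts how many barriers the modelled switch path issues. */
static unsigned int isb_count;

/* Stand-in for the arm64 isb() barrier; here it only bumps a counter. */
static void isb_model(void)
{
	isb_count++;
}

/* Hypothetical hooks as they might look today: each one synchronises itself. */
static void hook_a(void) { /* sysreg write would go here */ isb_model(); }
static void hook_b(void) { /* sysreg write would go here */ isb_model(); }
static void hook_c(void) { /* sysreg write would go here */ isb_model(); }

/* The same hypothetical hooks with the trailing barrier removed. */
static void hook_a_nosync(void) { /* sysreg write only */ }
static void hook_b_nosync(void) { /* sysreg write only */ }
static void hook_c_nosync(void) { /* sysreg write only */ }

/* Modelled switch path with one barrier per hook. */
static unsigned int switch_with_per_hook_isb(void)
{
	isb_count = 0;
	hook_a();
	hook_b();
	hook_c();
	return isb_count;
}

/* Modelled switch path with a single barrier before the final switch. */
static unsigned int switch_with_grouped_isb(void)
{
	isb_count = 0;
	hook_a_nosync();
	hook_b_nosync();
	hook_c_nosync();
	isb_model();	/* the one barrier, just before the modelled cpu_switch_to() */
	return isb_count;
}
```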

IIRC the issue is that unless a process runs in the root pid namespace,
the actual pid written to contextidr is meaningless.

Now that you reminded me of that thread, I see three options (sorry, not
entirely related to the defconfig updates):

1. Remove CONFIG_PID_IN_CONTEXTIDR and corresponding code completely,
   find other events to correlate the task with the trace.

2. Always on CONFIG_PID_IN_CONTEXTIDR (we might as well remove the
   Kconfig entry). This would write the root pid namespace value
   (task_pid_nr()).

3. Similar to (2) but instead write task_pid_nr_ns(). An alternative
   here is to write -1 if the task is not in the root pid namespace.
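
To make the difference between (2) and (3) concrete, here is a minimal
userspace sketch of which value would end up in CONTEXTIDR in each case.
struct task_model and the contextidr_opt*() helpers are hypothetical
names for illustration only, and the register is modelled as a plain
32-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bits of task state that matter here. */
struct task_model {
	int32_t root_pid;	/* pid as seen from the root pid namespace */
	int32_t ns_pid;		/* pid as seen from the task's own pid namespace */
	bool in_root_ns;	/* true if the task runs in the root pid namespace */
};

/* Option (2): always write the root pid namespace value. */
static uint32_t contextidr_opt2(const struct task_model *t)
{
	return (uint32_t)t->root_pid;
}

/* Option (3): write the pid as seen from the task's own namespace. */
static uint32_t contextidr_opt3(const struct task_model *t)
{
	return (uint32_t)t->ns_pid;
}

/* Alternative to (3): write -1 unless the task is in the root namespace. */
static uint32_t contextidr_opt3_alt(const struct task_model *t)
{
	return t->in_root_ns ? (uint32_t)t->root_pid : UINT32_MAX;
}
```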

Strong preference for (1).

James Clark Sept. 22, 2022, 9:34 a.m. UTC | #2
On 21/09/2022 16:08, Catalin Marinas wrote:
> On Wed, Sep 21, 2022 at 03:05:34PM +0100, James Clark wrote:
>> As suggested by Catalin here's the change to add Coresight to defconfig.
>>
>> Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X
>> which builds a few files until [1] is merged because of the overhead
>> of CONFIG_PID_IN_CONTEXTIDR.
>>
>> [1]: https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/T/
> 
> I thought the overhead wasn't the problem, it's mostly negligible. We
> can probably save a few more cycles on the __switch_to() path by
> replacing several isb()s in those functions with a single one just
> before cpu_switch_to().
> 
> IIRC the issue is that unless a process runs in the root pid namespace,
> the actual pid written to contextidr is meaningless.

This is true, and Leo has recently disabled it in that scenario in
aab473867fed.

> 
> Now that you reminded me of that thread, I see three options (sorry, not
> entirely related to the defconfig updates):
> 
> 1. Remove CONFIG_PID_IN_CONTEXTIDR and corresponding code completely,
>    find other events to correlate the task with the trace.

Unfortunately when tracing per core we would need kernel timestamps in
the trace to correlate to the switch records. At the moment Coresight is
using a different clock source so it's not possible and we're still
using the context ID to correlate samples.

With FEAT_TRF in v8.4 it will be possible to do this and we've started
working towards that here: 0f00b223ea22. But we'd still have to support
older hardware too, so CONFIG_PID_IN_CONTEXTIDR can't be removed completely.

For SPE it's not required because we already have the right timestamps
in the samples and we've added support for no context IDs in the decoder
here: 27d113cfe892

> 
> 2. Always on CONFIG_PID_IN_CONTEXTIDR (we might as well remove the
>    Kconfig entry). This would write the root pid namespace value
>    (task_pid_nr()).

If we're not worried about the overhead after all, this would be the
easiest solution. And then SPE or Coresight already decide whether they
want to use the value or not, so no further changes are needed.

From Leo's patch there is a table that shows a 1% overhead with it
enabled permanently, and I've heard a figure like that mentioned before.
So I could also resurrect that patch to use static keys? Although it's a
bit more complicated, that would be my preference. And then we can have
that mode always on.
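
For reference, the static key approach would look roughly like the
userspace model below. This is only a sketch and not the code from
Leo's patch: a plain bool stands in for the static key (in the kernel
this would presumably be DEFINE_STATIC_KEY_FALSE() with
static_branch_unlikely() and static_branch_enable()/disable(), so the
disabled case is a patched-out branch rather than a load), and all the
names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain flag standing in for a static key (assumption, see above). */
static bool contextidr_tracing_enabled;

/* Stand-in for the CONTEXTIDR register and a count of writes to it. */
static uint32_t contextidr_model;
static unsigned int contextidr_writes;

/* Hypothetical hooks a trace session would call to flip the key. */
static void trace_session_start(void)
{
	contextidr_tracing_enabled = true;
}

static void trace_session_stop(void)
{
	contextidr_tracing_enabled = false;
}

/* Modelled context switch: only pay for the write while tracing is active. */
static void context_switch_model(uint32_t next_pid)
{
	if (!contextidr_tracing_enabled)
		return;

	contextidr_model = next_pid;
	contextidr_writes++;
}
```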

> 
> 3. Similar to (2) but instead write task_pid_nr_ns(). An alternative
>    here is to write -1 if the task is not in the root pid namespace.
> 
> Strong preference for (1).
>

Catalin Marinas Sept. 22, 2022, 10:52 a.m. UTC | #3
On Thu, Sep 22, 2022 at 10:34:45AM +0100, James Clark wrote:
> On 21/09/2022 16:08, Catalin Marinas wrote:
> > 2. Always on CONFIG_PID_IN_CONTEXTIDR (we might as well remove the
> >    Kconfig entry). This would write the root pid namespace value
> >    (task_pid_nr()).
> 
> If we're not worried about the overhead after all, this would be the
> easiest solution. And then SPE or Coresight already decide whether they
> want to use the value or not, so no further changes are needed.
> 
> From Leo's patch there is a table that shows a 1% overhead with it
> enabled permanently, and I've heard a figure like that mentioned before.
> So I could also resurrect that patch to use static keys? Although it's a
> bit more complicated, that would be my preference. And then we can have
> that mode always on.

I don't think we should bother with static keys, just always enable it
but try to reduce/group the ISBs from all the functions called on the
__switch_to() path. We may actually get a speed-up.

James Clark Sept. 22, 2022, 1:06 p.m. UTC | #4
On 22/09/2022 11:52, Catalin Marinas wrote:
> On Thu, Sep 22, 2022 at 10:34:45AM +0100, James Clark wrote:
>> On 21/09/2022 16:08, Catalin Marinas wrote:
>>> 2. Always on CONFIG_PID_IN_CONTEXTIDR (we might as well remove the
>>>    Kconfig entry). This would write the root pid namespace value
>>>    (task_pid_nr()).
>>
>> If we're not worried about the overhead after all, this would be the
>> easiest solution. And then SPE or Coresight already decide whether they
>> want to use the value or not, so no further changes are needed.
>>
>> From Leo's patch there is a table that shows a 1% overhead with it
>> enabled permanently, and I've heard a figure like that mentioned before.
>> So I could also resurrect that patch to use static keys? Although it's a
>> bit more complicated, that would be my preference. And then we can have
>> that mode always on.
> 
> I don't think we should bother with static keys, just always enable it
> but try to reduce/group the ISBs from all the functions called on the
> __switch_to() path. We may actually get a speed-up.
> 

OK, thanks. I will take a look at this.