
[v3,0/2] perf: arm64: Kernel support for Dwarf unwinding through SVE functions

Message ID 20220901132658.1024635-1-james.clark@arm.com (mailing list archive)

Message

James Clark Sept. 1, 2022, 1:26 p.m. UTC
Hi,

I'm resubmitting this with a few of the changes suggested by Will on V2.

I haven't made any changes regarding the open questions about the
discoverability or saving the new reg and passing to output_sample()
because I think it's best to be consistent with the implementations on
other platforms first. I have explained in more detail on v2 [1].

[1]: https://lore.kernel.org/lkml/5fcf1a6f-c8fb-c296-992e-18aae8874095@arm.com/

=======

Changes since v2:

  * Add definition for PERF_REG_EXTENDED_MASK which is needed for
    PERF_PMU_CAP_EXTENDED_REGS to work properly

  * Simplify changes to enum perf_event_arm_regs

Changes since v1:

  * Add Mark's review tag
  * Clarify in docs that it's the SVE register length
  * Split patchset into kernel side and Perf tool changes

=======

When SVE registers are pushed onto the stack, the VG register is required to
unwind, because the stack offsets vary with the SVE register width at the time
the sample was taken.
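
As a rough illustration of why the offsets depend on VG, here is a minimal C
sketch. VG is the SVE vector length counted in 64-bit granules, so one Z
register occupies VG * 8 bytes. The frame layout, the function names and the
parameters are hypothetical and only serve to show the arithmetic; real
unwinders evaluate the DWARF expressions emitted by the compiler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * VG ("vector granule") is the SVE vector length in 64-bit granules,
 * so the length of one Z register in bytes is VG * 8.
 */
static uint64_t sve_vl_bytes(uint64_t vg)
{
	return vg * 8;
}

/*
 * Hypothetical frame: `fixed_bytes` of ordinary saved state plus
 * `nr_z_regs` full SVE Z registers sit between SP and the CFA.
 * The SVE part of the offset scales with VG, so the unwinder cannot
 * compute the CFA without the VG value from the sample.
 */
static uint64_t frame_cfa(uint64_t sp, uint64_t fixed_bytes,
			  uint64_t nr_z_regs, uint64_t vg)
{
	/* SVE vector lengths are multiples of 128 bits, so VG is even. */
	assert(vg >= 2 && vg % 2 == 0);
	return sp + fixed_bytes + nr_z_regs * sve_vl_bytes(vg);
}
```

With the same SP and the same code, a 128-bit vector length (VG = 2) and a
256-bit vector length (VG = 4) give different CFA values, which is why a
sampled VG is needed.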

These first two patches add support for sampling the VG register to the kernel
and the docs. There is another patchset to add support to userspace perf.
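
For illustration, a minimal sketch of how a userspace tool might build the
register mask that requests VG, of the kind stored in
perf_event_attr.sample_regs_user when PERF_SAMPLE_REGS_USER is set. The
SKETCH_* constants are local stand-ins, and the values 33 and 46 are
assumptions about the enum in arch/arm64/include/uapi/asm/perf_regs.h; check
them against the header before relying on them.

```c
#include <stdint.h>

/* Assumed values, stand-ins for the uapi enum (verify against the header). */
#define SKETCH_REG_ARM64_MAX	33	/* x0-x30, sp, pc */
#define SKETCH_REG_ARM64_VG	46	/* SVE vector granule pseudo register */

/* Build a mask with all core registers, optionally adding VG. */
static uint64_t user_regs_mask(int want_vg)
{
	uint64_t mask = (1ULL << SKETCH_REG_ARM64_MAX) - 1;

	if (want_vg)
		mask |= 1ULL << SKETCH_REG_ARM64_VG;
	return mask;
}

/* Return non-zero if the mask asks for the VG register. */
static int mask_has_vg(uint64_t mask)
{
	return (mask >> SKETCH_REG_ARM64_VG) & 1;
}
```

The resulting mask would be assigned to the sample_regs_user field of the
event attributes before the event is opened.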

A small change is also required in libunwind or libdw, depending on which
unwinder is used; these changes will be published later. Without them, Perf
continues to work with both libraries, but the VG register is not used for
unwinding.

Thanks
James

James Clark (2):
  perf: arm64: Add SVE vector granule register to user regs
  arm64/sve: Add Perf extensions documentation

 Documentation/arm64/sve.rst             | 20 +++++++++++++++++
 arch/arm64/include/uapi/asm/perf_regs.h |  7 ++++++
 arch/arm64/kernel/perf_regs.c           | 30 +++++++++++++++++++++++--
 drivers/perf/arm_pmu.c                  |  2 +-
 4 files changed, 56 insertions(+), 3 deletions(-)

Comments

Will Deacon Sept. 22, 2022, 2:04 p.m. UTC | #1
On Thu, Sep 01, 2022 at 02:26:56PM +0100, James Clark wrote:
> I'm resubmitting this with a few of the changes suggested by Will on V2.
> 
> I haven't made any changes regarding the open questions about the
> discoverability or saving the new reg and passing to output_sample()
> because I think it's best to be consistent with the implementations on
> other platforms first. I have explained in more detail on v2 [1].
> 
> [1]: https://lore.kernel.org/lkml/5fcf1a6f-c8fb-c296-992e-18aae8874095@arm.com/

Fair enough, I can't argue against being consistent.

Given that this exposes subtle new user ABI, do we have any coverage in
the selftests? If not, please could you add something?

Thanks,

Will
James Clark Sept. 22, 2022, 2:31 p.m. UTC | #2
On 22/09/2022 15:04, Will Deacon wrote:
> On Thu, Sep 01, 2022 at 02:26:56PM +0100, James Clark wrote:
>> I'm resubmitting this with a few of the changes suggested by Will on V2.
>>
>> I haven't made any changes regarding the open questions about the
>> discoverability or saving the new reg and passing to output_sample()
>> because I think it's best to be consistent with the implementations on
>> other platforms first. I have explained in more detail on v2 [1].
>>
>> [1]: https://lore.kernel.org/lkml/5fcf1a6f-c8fb-c296-992e-18aae8874095@arm.com/
> 
> Fair enough, I can't argue against being consistent.
> 
> Given that this exposes subtle new user ABI, do we have any coverage in
> the selftests? If not, please could you add something?
> 

Thanks, I will do that. I assume you mean the self tests in
tools/perf/tests and not some non-Perf tests?

> Thanks,
> 
> Will
Will Deacon Sept. 22, 2022, 8:33 p.m. UTC | #3
On Thu, 1 Sep 2022 14:26:56 +0100, James Clark wrote:
> I'm resubmitting this with a few of the changes suggested by Will on V2.
> 
> I haven't made any changes regarding the open questions about the
> discoverability or saving the new reg and passing to output_sample()
> because I think it's best to be consistent with the implementations on
> other platforms first. I have explained in more detail on v2 [1].
> 
> [...]

Applied to will (for-next/perf), thanks!

[1/2] perf: arm64: Add SVE vector granule register to user regs
      https://git.kernel.org/will/c/cbb0c02caf4b
[2/2] arm64/sve: Add Perf extensions documentation
      https://git.kernel.org/will/c/1f2906d1e10a

Cheers,
Will Deacon Sept. 22, 2022, 8:57 p.m. UTC | #4
On Thu, Sep 22, 2022 at 03:31:20PM +0100, James Clark wrote:
> 
> 
> On 22/09/2022 15:04, Will Deacon wrote:
> > On Thu, Sep 01, 2022 at 02:26:56PM +0100, James Clark wrote:
> >> I'm resubmitting this with a few of the changes suggested by Will on V2.
> >>
> >> I haven't made any changes regarding the open questions about the
> >> discoverability or saving the new reg and passing to output_sample()
> >> because I think it's best to be consistent with the implementations on
> >> other platforms first. I have explained in more detail on v2 [1].
> >>
> >> [1]: https://lore.kernel.org/lkml/5fcf1a6f-c8fb-c296-992e-18aae8874095@arm.com/
> > 
> > Fair enough, I can't argue against being consistent.
> > 
> > Given that this exposes subtle new user ABI, do we have any coverage in
> > the selftests? If not, please could you add something?
> > 
> 
> Thanks, I will do that. I assume you mean the self tests in
> tools/perf/tests and not some non-Perf tests?

I hadn't thought much about it, so wherever is best. It would just be nice
to have something we can run to make sure that this continues to work as
intended.

Will
James Clark Sept. 23, 2022, 9:32 a.m. UTC | #5
On 22/09/2022 21:33, Will Deacon wrote:
> On Thu, 1 Sep 2022 14:26:56 +0100, James Clark wrote:
>> I'm resubmitting this with a few of the changes suggested by Will on V2.
>>
>> I haven't made any changes regarding the open questions about the
>> discoverability or saving the new reg and passing to output_sample()
>> because I think it's best to be consistent with the implementations on
>> other platforms first. I have explained in more detail on v2 [1].
>>
>> [...]
> 
> Applied to will (for-next/perf), thanks!
> 
> [1/2] perf: arm64: Add SVE vector granule register to user regs
>       https://git.kernel.org/will/c/cbb0c02caf4b
> [2/2] arm64/sve: Add Perf extensions documentation
>       https://git.kernel.org/will/c/1f2906d1e10a
> 
> Cheers,

Thanks Will. Sorry about the build, I will fix my config for next time.
Will Deacon Sept. 23, 2022, 12:36 p.m. UTC | #6
On Fri, Sep 23, 2022 at 10:32:15AM +0100, James Clark wrote:
> 
> 
> On 22/09/2022 21:33, Will Deacon wrote:
> > On Thu, 1 Sep 2022 14:26:56 +0100, James Clark wrote:
> >> I'm resubmitting this with a few of the changes suggested by Will on V2.
> >>
> >> I haven't made any changes regarding the open questions about the
> >> discoverability or saving the new reg and passing to output_sample()
> >> because I think it's best to be consistent with the implementations on
> >> other platforms first. I have explained in more detail on v2 [1].
> >>
> >> [...]
> > 
> > Applied to will (for-next/perf), thanks!
> > 
> > [1/2] perf: arm64: Add SVE vector granule register to user regs
> >       https://git.kernel.org/will/c/cbb0c02caf4b
> > [2/2] arm64/sve: Add Perf extensions documentation
> >       https://git.kernel.org/will/c/1f2906d1e10a
> > 
> > Cheers,
> 
> Thanks Will. Sorry about the build, I will fix my config for next time.

No problem. For some reason, I was unable to repro the failure locally.
Maybe it's a GCC thing?

Will
James Clark Sept. 23, 2022, 12:43 p.m. UTC | #7
On 23/09/2022 13:36, Will Deacon wrote:
> On Fri, Sep 23, 2022 at 10:32:15AM +0100, James Clark wrote:
>>
>>
>> On 22/09/2022 21:33, Will Deacon wrote:
>>> On Thu, 1 Sep 2022 14:26:56 +0100, James Clark wrote:
>>>> I'm resubmitting this with a few of the changes suggested by Will on V2.
>>>>
>>>> I haven't made any changes regarding the open questions about the
>>>> discoverability or saving the new reg and passing to output_sample()
>>>> because I think it's best to be consistent with the implementations on
>>>> other platforms first. I have explained in more detail on v2 [1].
>>>>
>>>> [...]
>>>
>>> Applied to will (for-next/perf), thanks!
>>>
>>> [1/2] perf: arm64: Add SVE vector granule register to user regs
>>>       https://git.kernel.org/will/c/cbb0c02caf4b
>>> [2/2] arm64/sve: Add Perf extensions documentation
>>>       https://git.kernel.org/will/c/1f2906d1e10a
>>>
>>> Cheers,
>>
>> Thanks Will. Sorry about the build, I will fix my config for next time.
> 
> No problem. For some reason, I was unable to repro the failure locally.
> Maybe it's a GCC thing?

I needed CONFIG_HEADERS_INSTALL and CONFIG_UAPI_HEADER_TEST to reproduce
it. I was already using GCC, so I'm not sure whether it depends on that or
not.

> 
> Will