[00/19] coresight: Support for ETMv4.4 system instructions

Message ID 20200911084119.1080694-1-suzuki.poulose@arm.com (mailing list archive)

Message

Suzuki K Poulose Sept. 11, 2020, 8:41 a.m. UTC
CoreSight ETMv4.4 introduced system instructions for accessing
the ETM. This also implies that they may not be on the amba bus.
Right now all the CoreSight components are accessed via memory
map. Also, we have some common routines in coresight generic
code driver (e.g, CS_LOCK, claim/disclaim), which assume the
mmio. In order to preserve the generic algorithms at a single
place and to allow dynamic switch for ETMs, this series introduces
an abstraction layer for accessing a coresight device. It is
designed such that the mmio access are fast tracked (i.e, without
an indirect function call).
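
As a rough illustration of the idea (not the code in the series), the access descriptor can carry a flag for memory-mapped devices so that the common case is a direct load or store, and only non-MMIO devices go through function pointers. The sketch below is plain user-space C; the struct, the helper names and the toy "sysreg" backend are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the series' csdev_access; all names are illustrative. */
struct csa_sketch {
	bool io_mem;				/* true: device is memory mapped */
	union {
		volatile uint32_t *base;	/* used when io_mem is true */
		struct {			/* used when io_mem is false */
			uint32_t (*read)(uint32_t offset);
			void (*write)(uint32_t val, uint32_t offset);
		} ops;
	} u;
};

/* MMIO is the fast path: a direct load with no indirect function call. */
static inline uint32_t csa_read32(const struct csa_sketch *csa, uint32_t offset)
{
	assert(csa != NULL);
	if (csa->io_mem)
		return csa->u.base[offset / sizeof(uint32_t)];
	return csa->u.ops.read(offset);
}

static inline void csa_write32(const struct csa_sketch *csa, uint32_t val,
			       uint32_t offset)
{
	assert(csa != NULL);
	if (csa->io_mem)
		csa->u.base[offset / sizeof(uint32_t)] = val;
	else
		csa->u.ops.write(val, offset);
}

/* Toy backend standing in for system instruction access (not real sysregs). */
static uint32_t fake_sysregs[4];

static uint32_t fake_sysreg_read(uint32_t offset)
{
	return fake_sysregs[offset / sizeof(uint32_t)];
}

static void fake_sysreg_write(uint32_t val, uint32_t offset)
{
	fake_sysregs[offset / sizeof(uint32_t)] = val;
}
```

Generic helpers (lock/unlock, claim/disclaim, timeouts) would then take the descriptor instead of a raw base address, which is what lets one copy of each algorithm serve both access methods.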

This will also help us to get rid of the driver+attribute specific
sysfs show/store routines and replace them with a single routine
to access a given register offset (which can be embedded in the
dev_ext_attribute). This is not currently implemented in the series,
but can be achieved.
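
To make the possible follow-up concrete, here is a minimal user-space sketch, assuming each attribute only needs to carry a register offset. The struct and function names are hypothetical and merely mimic the role a dev_ext_attribute payload would play; none of this is part of the series.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical analogue of an attribute whose payload is a register offset. */
struct reg_attr_sketch {
	const char *name;
	uint32_t offset;	/* byte offset of the register */
};

/* One show routine shared by every register attribute. */
static int reg_show_sketch(const uint32_t *regs,
			   const struct reg_attr_sketch *attr,
			   char *buf, size_t len)
{
	uint32_t val;

	assert(regs != NULL && attr != NULL && buf != NULL);
	val = regs[attr->offset / sizeof(uint32_t)];
	return snprintf(buf, len, "0x%" PRIx32 "\n", val);
}
```

With something along these lines, adding a new read-only register file would mean adding a table entry rather than a new show function.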

Further we switch the generic routines to work with the abstraction.
With this in place, we refactor the etm4x code a bit to allow for
supporting the system instructions with very little new code. The
changes also switch to using the system instructions by default
even when we may have an MMIO.

We use TRCDEVARCH for the detection of the ETM component, which
is a standard register as per CoreSight architecture, rather than
the etm specific id register TRCIDR3. This is for making sure
that we are able to detect the ETM via system instructions accurately,
when the trace unit could be anything (etm or a custom trace unit).

The series has been mildly tested on a model. I would really
appreciate any testing on real hardware.

Applies on coresight/next

Changes since V1:
  - Flip the switch for iomem from no_iomem to io_mem in csdev_access.
  - Split patches for claim/disclaim and CS_LOCK/UNLOCK conversions.
  - Move device access initialisation for etm4x to the target CPU
  - Cleanup secure exception level mask handling.
  - Switch to use TRCDEVARCH for ETM component discovery. This
    is for making 
  - Check the availability of OS/Software Locks before using them.
  

Suzuki K Poulose (19):
  coresight: Introduce device access abstraction
  coresight: tpiu: Prepare for using coresight device access abstraction
  coresight: Convert coresight_timeout to use access abstraction
  coresight: Convert claim/disclaim operations to use access wrappers
  coresight: Use device access layer for Software lock/unlock operations
  coresight: etm4x: Always read the registers on the host CPU
  coresight: etm4x: Convert all register accesses
  coresight: etm4x: Add commentary on the registers
  coresight: etm4x: Add sysreg access helpers
  coresight: etm4x: Define DEVARCH register fields
  coresight: etm4x: Check for OS and Software Lock
  coresight: etm4x: Cleanup secure exception level masks
  coresight: etm4x: Clean up exception level masks
  coresight: etm4x: Detect access early on the target CPU
  coresight: etm4x: Use TRCDEVARCH for component discovery
  coresight: etm4x: Detect system instructions support
  coresight: etm4x: Refactor probing routine
  coresight: etm4x: Add support for sysreg only devices
  dts: bindings: coresight: ETMv4.4 system register access only units

 .../devicetree/bindings/arm/coresight.txt     |   6 +-
 drivers/hwtracing/coresight/coresight-catu.c  |  24 +-
 .../hwtracing/coresight/coresight-cpu-debug.c |  22 +-
 .../hwtracing/coresight/coresight-cti-sysfs.c |   5 +-
 drivers/hwtracing/coresight/coresight-cti.c   |  34 +-
 drivers/hwtracing/coresight/coresight-etb10.c |  29 +-
 .../coresight/coresight-etm3x-sysfs.c         |  10 +-
 drivers/hwtracing/coresight/coresight-etm3x.c |  35 +-
 .../coresight/coresight-etm4x-sysfs.c         |  44 +-
 drivers/hwtracing/coresight/coresight-etm4x.c | 716 +++++++++++-------
 drivers/hwtracing/coresight/coresight-etm4x.h | 440 ++++++++++-
 .../hwtracing/coresight/coresight-funnel.c    |  22 +-
 drivers/hwtracing/coresight/coresight-priv.h  |   9 +-
 .../coresight/coresight-replicator.c          |  31 +-
 drivers/hwtracing/coresight/coresight-stm.c   |  50 +-
 .../hwtracing/coresight/coresight-tmc-etf.c   |  38 +-
 .../hwtracing/coresight/coresight-tmc-etr.c   |  20 +-
 drivers/hwtracing/coresight/coresight-tmc.c   |  16 +-
 drivers/hwtracing/coresight/coresight-tpiu.c  |  32 +-
 drivers/hwtracing/coresight/coresight.c       | 130 +++-
 include/linux/coresight.h                     | 230 +++++-
 21 files changed, 1449 insertions(+), 494 deletions(-)

Comments

Mike Leach Sept. 18, 2020, 3:33 p.m. UTC | #1
Hi Suzuki,

I've looked at the set and have only one real gripe - the
implementation and timing of component detection on the sysreg path.
I've summarised my thoughts here, but as the changes are found across
multiple patches I may well have repeated myself a little in the
individual places.

On Fri, 11 Sep 2020 at 09:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> CoreSight ETMv4.4 introduced system instructions for accessing
> the ETM. This also implies that they may not be on the amba bus.

System instructions have always been an option - but we have never
supported them up to now. In fact both memory access and system
instructions can live side by side - but the driver really needs to
choose just one!
What did happen is that a PE that supports Arm Trace 8.4 mandates ETM
4.4, and ETM 4.4 mandates system instruction access for PEs with Arm
Trace 8.4, and deprecates memory access.

But there is nothing to stop other variants having the system
instruction interface. So there is no need to describe this as a
purely 4.4 onwards support - it will support any version of the ETM
that has sysreg access.

The spec permits aarch32 / armv7 register access via CP14 - but I
assume this is omitted deliberately & not intended to be supported at
this time.

> Right now all the CoreSight components are accessed via memory
> map. Also, we have some common routines in coresight generic
> code driver (e.g, CS_LOCK, claim/disclaim), which assume the
> mmio. In order to preserve the generic algorithms at a single
> place and to allow dynamic switch for ETMs, this series introduces
> an abstraction layer for accessing a coresight device. It is
> designed such that the mmio access are fast tracked (i.e, without
> an indirect function call).
>
> This will also help us to get rid of the driver+attribute specific
> sysfs show/store routines and replace them with a single routine
> to access a given register offset (which can be embedded in the
> dev_ext_attribute). This is not currently implemented in the series,
> but can be achieved.
>
> Further we switch the generic routines to work with the abstraction.
> With this in place, we refactor the etm4x code a bit to allow for
> supporting the system instructions with very little new code. The
> changes also switch to using the system instructions by default
> even when we may have an MMIO.
>
> We use TRCDEVARCH for the detection of the ETM component, which
> is a standard register as per CoreSight architecture, rather than
> the etm specific id register TRCIDR3. This is for making sure
> that we are able to detect the ETM via system instructions accurately,
> when the trace unit could be anything (etm or a custom trace unit).
>

I'm assuming you mean TRCIDR1 here -- which in part, defines the etm
architecture version. TRCIDR3 does something else entirely.
Not sure I agree with this though - the driver is designed to match
the ETM spec so there is no problem with using TRCIDR1 to spot
functional variants according to ETM version.

The etm4_init_arch_data() function is not about detecting the presence
of an ETMv4 component, but about exploring the capabilities it has. We
check 4 bits of the version as a sanity check, but at this point we
should be pretty sure we are dealing with an ETM of some kind.

TRCDEVARCH is already used for detection in the AMBA matching code -
assuming the table includes the optional CoreSight UCI. I would
imagine that similar detection needs to go on for instruction access -
but once we have detected an ETM, then ETM architected registers are
sufficient. If the device is not an ETM then it should be detected and
rejected early - and the bindings examined to determine why this
driver was attached!

The act of adding in a check against TRCDEVARCH as part of the
etm4_init_arch_data() function adds new and hidden checks to AMBA
devices where it was sufficient to have an entry in the probe match
table before. Most recent additions include the UCI matching, but
older ones don't. I am concerned that this change may trip up older
existing implementations which for some reason may not have
TRCDEVARCH, or have set it to not present.

For this reason, I believe that the TRCDEVARCH check for the sys reg
access should occur on the sysreg specific probe - balancing what
happens on the AMBA side.  That way the common code remains common.
Further the setup of the CSA for the device can happen immediately in
the common etm_probe() function, based on *base being NULL or not,
rather than as a side effect of the etm4_init_arch_data() call.


Regards

Mike




--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

Suzuki K Poulose Sept. 25, 2020, 9:55 a.m. UTC | #2
Hi Mike,


First of all, thank you so much for your in-depth review.
Please find my comments inline.

On 09/18/2020 04:33 PM, Mike Leach wrote:
> Hi Suzuki,
> 
> I've looked at the set and have only one real gripe - the
> implementation and timing of component detection on the sysreg path.
> I've summarised my thoughts here, but as the changes are found across
> multiple patches I may well have repeated myself a little in the
> individual places.
> 
> On Fri, 11 Sep 2020 at 09:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> CoreSight ETMv4.4 introduced system instructions for accessing
>> the ETM. This also implies that they may not be on the amba bus.
> 
> System instructions have always been an option - but we have never
> supported them up to now. In fact both memory access and system
> instructions can live side by side - but the driver really needs to
> choose just one!
> What did happen is that a PE that supports Arm Trace 8.4 mandates ETM
> 4.4, and ETM 4.4 mandates system instruction access for PEs with Arm
> Trace 8.4, and deprecates memory access.
> 
> But there is nothing to stop other variants having the system
> instruction interface. So there is no need to describe this as a
> purely 4.4 onwards support - it will support any version of the ETM
> that has sysreg access.

Correct, I agree, and will change this.

> 
> The spec permits aarch32 / armv7 register access via CP14 - but I
> assume this is omitted deliberately & not intended to be supported at
> this time.

Yes, because the ETMv4 driver doesn't support 32-bit Arm at all. We could
definitely add this in the future. I will add it to the commit description.

> 
>> Right now all the CoreSight components are accessed via memory
>> map. Also, we have some common routines in coresight generic
>> code driver (e.g, CS_LOCK, claim/disclaim), which assume the
>> mmio. In order to preserve the generic algorithms at a single
>> place and to allow dynamic switch for ETMs, this series introduces
>> an abstraction layer for accessing a coresight device. It is
>> designed such that the mmio access are fast tracked (i.e, without
>> an indirect function call).
>>
>> This will also help us to get rid of the driver+attribute specific
>> sysfs show/store routines and replace them with a single routine
>> to access a given register offset (which can be embedded in the
>> dev_ext_attribute). This is not currently implemented in the series,
>> but can be achieved.
>>
>> Further we switch the generic routines to work with the abstraction.
>> With this in place, we refactor the etm4x code a bit to allow for
>> supporting the system instructions with very little new code. The
>> changes also switch to using the system instructions by default
>> even when we may have an MMIO.
>>
>> We use TRCDEVARCH for the detection of the ETM component, which
>> is a standard register as per CoreSight architecture, rather than
>> the etm specific id register TRCIDR3. This is for making sure
>> that we are able to detect the ETM via system instructions accurately,
>> when the trace unit could be anything (etm or a custom trace unit).
>>
> 
> I'm assuming you mean TRCIDR1 here -- which in part, defines the etm
> architecture version. TRCIDR3 does something else entirely.

Yes, definitely, it is a mistake on my side, generated from staring at
both TRCIDR1 and TRCIDR3 (for the masks).

> Not sure I agree with this though - the driver is designed to match
> the ETM spec so there is no problem with using TRCIDR1 to spot
> functional variants according to ETM version.

Correct. But for a system instruction based trace unit, we can't trust
just the bindings and must use a CoreSight architected register to
do the basic detection. Also, I am planning to add support for the Future
Architectures for the processor trace [0] with the ETM driver, which
mandates the use of TRCDEVARCH for the trace version. So, this is more
of

> 
> The etm4_init_arch_data() function is not about detecting the presence
> of an ETMv4 component, but about exploring the capabilities it has. We
> check 4 bits of the version as a sanity check, but at this point we
> should be pretty sure we are dealing with an ETM of some kind.
> 
> TRCDEVARCH is already used for detection in the AMBA matching code -
> assuming the table includes the optional CoreSight UCI. I would
> imagine that similar detection needs to go on for instruction access -
> but once we have detected an ETM, then ETM architected registers are
> sufficient. If the device is not an ETM then it should be detected and
> rejected early - and the bindings examined to determine why this
> driver was attached!

I agree with the fact that we should check the device for a supported
type at the earliest and must not trust the bindings. With the AMBA
based devices we have the early check as mentioned above via the PIDs
and the UCI (where available). But where the UCI is not listed, these
will be caught by the additional checks on the TRCIDR1 fields. E.g., a
CTI could have the same PID as an ETM4 and, without the UCI field,
the driver could assume that a CTI is an ETM if the firmware was
incorrect.

> 
> The act of adding in a check against TRCDEVARCH as part of the
> etm4_init_arch_data() function adds new and hidden checks to AMBA
> devices where it was sufficient to have an entry in the probe match
> table before. Most recent additions include the UCI matching, but
> older ones don't. I am concerned that this change may trip up older
> existing implementations which for some reason may not have
> TRCDEVARCH, or have set it to not present.

For the record, ETMv4.0 revision A says the PRESENT bit is always
Read As One (RAO). So, if they don't implement it or have set it to 0,
that means that they are broken. But, we could gracefully handle it
if the PRESENT bit is 0 and fall back to TRCIDR1.

> 
> For this reason, I believe that the TRCDEVARCH check for the sys reg
> access should occur on the sysreg specific probe - balancing what
> happens on the AMBA side.  That way the common code remains common.

To make the current situation clear, for those who have not looked
at the series, here is the summary :

1) AMBA driver checks the PIDs to match a device to known ETMv4.
    Note that CTIs could share the same PIDs and thus we added additional
    check on the UCI (which is TRCDEVARCH) field for some of the ETMs.

2) The ETM4 driver assumes that the component is ETMv4 and calls
    etm4_init_arch_data() and probes the ETM4 for features, filling
    in drvdata, including the TRCIDR1. Please note that, at this point
    there is no guarantee that the unit is indeed ETMv4, if the UCI
    check (TRCDEVARCH) has not been performed. So, we are possibly
    treading into wild land here (at least on bring up).

3) The etm4_probe() confirms that the ETM4 architecture is supported
    by checking the TRCIDR1 fields (stored in drvdata->arch from step (2)).
    This check is important (at least for bring up), because if the
    UCI check is not added for the component, a CTI could be mistaken
    for an ETM with AMBA devices.

Fact : TRCDEVARCH must be implemented and represent that the component is
ETMv4 (this is the basis of UCI check) since ETMv4.0 specification.


For the system instructions based devices, we have :
1) Device tree compatible to advertise the presence of a trace unit on
    a CPU. (no PID checks, this is good, because you don't have to
    add an entry for a new CPU to be supported upstream, as long as
    it is compliant with the ETMv4).

2) To confirm that the CPU tracing unit is ETMv4 compatible, we need
    to use CoreSight architected register, TRCDEVARCH (the same as UCI).
    This is because TRCIDR1 may not be what is expected if the Trace unit
    is not ETMv4, since the encoding is ETM specific. And this must be
    performed on the host CPU (just like etm4_init_arch_data).
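
For readers unfamiliar with the register, here is a small sketch of what such a TRCDEVARCH check amounts to. The field layout and the Arm/ETMv4 values are as I read them from the CoreSight and ETMv4 specifications and should be verified there; the macro and function names are made up for illustration and are not the ones used in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DEVARCH field layout (assumed from the CoreSight architecture; please verify). */
#define DEVARCH_ARCHITECT_MASK	(0x7ffu << 21)	/* bits [31:21] */
#define DEVARCH_PRESENT		(1u << 20)	/* bit  [20]    */
#define DEVARCH_REVISION_MASK	(0xfu << 16)	/* bits [19:16] */
#define DEVARCH_ARCHID_MASK	0xffffu		/* bits [15:0]  */

#define DEVARCH_ARCHITECT_ARM	(0x23bu << 21)	/* Arm's JEP106 code */
#define DEVARCH_ARCHID_ETMV4	0x4a13u

/* Compare everything except REVISION, so all ETMv4.x minor versions match. */
#define DEVARCH_ID_MASK \
	(DEVARCH_ARCHITECT_MASK | DEVARCH_PRESENT | DEVARCH_ARCHID_MASK)
#define DEVARCH_ETMV4 \
	(DEVARCH_ARCHITECT_ARM | DEVARCH_PRESENT | DEVARCH_ARCHID_ETMV4)

static_assert(DEVARCH_ETMV4 == 0x47704a13u,
	      "composed value should equal the ETMv4 DEVARCH value");

/* True if the raw DEVARCH value describes a present, Arm-architected ETMv4. */
static bool devarch_is_etmv4(uint32_t devarch)
{
	return (devarch & DEVARCH_ID_MASK) == DEVARCH_ETMV4;
}
```

Because the encoding is defined by the CoreSight architecture rather than by the ETM, the comparison is meaningful even when the unit behind the system instructions turns out not to be an ETM.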

With this series:

   * AMBA devices pass through the PID check as usual. But the sysreg
     devices jump straight to etm4_init_arch_data() via common etm4_probe.

   * etm4_init_arch_data() will verify that the component is ETMv4
     by verifying the TRCDEVARCH for all ETMs (both AMBA and system
     instructions), before poking the features. This will avoid having
     to do another round of smp_func_call() in etm4 sysreg probe code.
      - Note: As per Mike's suggestion, this can be further relaxed to check
        TRCIDR1 for AMBA devices, iff TRCDEVARCH is marked absent for
        supporting any wild broken implementations out there (I prefer to add a
        pr_warn_once() for such cases, so that we know such units).
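
The policy in the note above could look roughly like the sketch below. It is only an illustration under stated assumptions: the enum and function names are hypothetical, the TRCIDR1 major version is assumed to live in bits [11:8], and in the driver the "legacy" outcome is where a pr_warn_once() would be issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEVARCH_PRESENT		(1u << 20)
#define DEVARCH_ID_MASK		0xfff0ffffu	/* ARCHITECT | PRESENT | ARCHID */
#define DEVARCH_ETMV4		0x47704a13u	/* Arm, present, ETMv4 ARCHID */
#define TRCIDR1_ARCHMAJ(v)	(((v) >> 8) & 0xfu)	/* assumed bits [11:8] */

static_assert((DEVARCH_ETMV4 & DEVARCH_PRESENT) != 0,
	      "expected value must have PRESENT set");

enum etm_detect_sketch { ETM_OK, ETM_OK_LEGACY, ETM_REJECT };

static enum etm_detect_sketch etm4_detect_sketch(bool io_mem, uint32_t devarch,
						 uint32_t trcidr1)
{
	/* TRCDEVARCH implemented: it is the only thing we trust. */
	if (devarch & DEVARCH_PRESENT)
		return (devarch & DEVARCH_ID_MASK) == DEVARCH_ETMV4 ?
			ETM_OK : ETM_REJECT;

	/* Sysreg devices have no PID match behind them, so never relax. */
	if (!io_mem)
		return ETM_REJECT;

	/* Older AMBA units without TRCDEVARCH: fall back to TRCIDR1. */
	if (TRCIDR1_ARCHMAJ(trcidr1) == 4)
		return ETM_OK_LEGACY;	/* the caller would warn once here */

	return ETM_REJECT;
}
```

The fallback is deliberately limited to memory-mapped devices, since those have already passed the AMBA PID match before this point.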


> Further the setup of the CSA for the device can happen immediately in
> the common etm_probe() function, based on *base being NULL or not,

For now, yes. But with the additional changes for supporting [0], this
may not hold, as we really need to see whether we have an ETMv4.x or a
future unit which has a slightly different register list.

> rather than as a side effect of the etm4_init_arch_data() call.

[0] https://developer.arm.com/docs/ddi0601/latest , See TRCIDR1

Kind regards
Suzuki

Mike Leach Sept. 29, 2020, 4:42 p.m. UTC | #3
Hi Suzuki,

On Fri, 25 Sep 2020 at 10:50, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Mike,
>
>
> First of all, thank you so much for your in-depth review.
> Please find my comments inline.
>
> On 09/18/2020 04:33 PM, Mike Leach wrote:
> > Hi Suzuki,
> >
> > I've looked at the set and have only one real gripe - the
> > implementation and timing of component detection on the sysreg path.
> > I've summarised my thoughts here, but as the changes are found across
> > multiple patches I may well have repeated myself a little in the
> > individual places.
> >
> > On Fri, 11 Sep 2020 at 09:41, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >>
> >> CoreSight ETMv4.4 introduced system instructions for accessing
> >> the ETM. This also implies that they may not be on the amba bus.
> >
> > System instructions have always been an option - but we have never
> > supported them up to now. In fact both memory access and system
> > instructions can live side by side - but the driver really needs to
> > choose just one!
> > What did happen is that a PE that supports Arm Trace 8.4 mandates ETM
> > 4.4, and ETM 4.4 mandates system instruction access for PEs with Arm
> > Trace 8.4, and deprecates memory access.
> >
> > But there is nothing to stop other variants having the system
> > instruction interface. So there is no need to describe this as a
> > purely 4.4 onwards support - it will support any version of the ETM
> > that has sysreg access.
>
> Correct, I agree, and will change this.
>
> >
> > The spec permits aarch32 / armv7 register access via CP14 - but I
> > assume this is omitted deliberately & not intended to be supported at
> > this time.
>
> Yes, because the ETMv4 driver doesn't support 32-bit Arm at all. We could
> definitely add this in the future. I will add it to the commit description.
>
> >
> >> Right now all the CoreSight components are accessed via memory
> >> map. Also, we have some common routines in coresight generic
> >> code driver (e.g, CS_LOCK, claim/disclaim), which assume the
> >> mmio. In order to preserve the generic algorithms at a single
> >> place and to allow dynamic switch for ETMs, this series introduces
> >> an abstraction layer for accessing a coresight device. It is
> >> designed such that the mmio access are fast tracked (i.e, without
> >> an indirect function call).
> >>
> >> This will also help us to get rid of the driver+attribute specific
> >> sysfs show/store routines and replace them with a single routine
> >> to access a given register offset (which can be embedded in the
> >> dev_ext_attribute). This is not currently implemented in the series,
> >> but can be achieved.
> >>
> >> Further we switch the generic routines to work with the abstraction.
> >> With this in place, we refactor the etm4x code a bit to allow for
> >> supporting the system instructions with very little new code. The
> >> changes also switch to using the system instructions by default
> >> even when we may have an MMIO.
> >>
> >> We use TRCDEVARCH for the detection of the ETM component, which
> >> is a standard register as per CoreSight architecture, rather than
> >> the etm specific id register TRCIDR3. This is for making sure
> >> that we are able to detect the ETM via system instructions accurately,
> >> when the trace unit could be anything (etm or a custom trace unit).
> >>
> >
> > I'm assuming you mean TRCIDR1 here -- which in part, defines the etm
> > architecture version. TRCIDR3 does something else entirely.
>
> Yes, definitely, it is a mistake on my side, generated from staring at
> both TRCIDR1 and TRCIDR3 (for the masks).
>
> > Not sure I agree with this though - the driver is designed to match
> > the ETM spec so there is no problem with using TRCIDR1 to spot
> > functional variants according to ETM version.
>
> Correct. But for a system instruction based trace unit, we can't trust
> just the bindings and must use a CoreSight architected register to
> do the basic detection. Also, I am planning to add support for the Future
> Architectures for the processor trace [0] with the ETM driver, which
> mandates the use of TRCDEVARCH for the trace version. So, this is more
> of
>

Agreed - sysreg must do its own validation of the connected component.

> >
> > The etm4_init_arch_data() function is not about detecting the presence
> > of an ETMv4 component, but about exploring the capabilities it has. We
> > check 4 bits of the version as a sanity check, but at this point we
> > should be pretty sure we are dealing with an ETM of some kind.
> >
> > TRCDEVARCH is already used for detection in the AMBA matching code -
> > assuming the table includes the optional CoreSight UCI. I would
> > imagine that similar detection needs to go on for instruction access -
> > but once we have detected an ETM, then ETM architected registers are
> > sufficient. If the device is not an ETM then it should be detected and
> > rejected early - and the bindings examined to determine why this
> > driver was attached!
>
> I agree with the fact that we should check the device for a supported
> type at the earliest and must not trust the bindings. With the AMBA
> based devices we have the early check as mentioned above via the PIDs
> and the UCI (where available). But where the UCI is not listed, these
> will be caught by the additional checks on the TRCIDR1 fields. E.g., a
> CTI could have the same PID as an ETM4 and, without the UCI field,
> the driver could assume that a CTI is an ETM if the firmware was
> incorrect.

The TRCDEVARCH register was introduced as part of CoreSight
Architecture 2.0, the UCI concept as part of CoreSight Architecture
3.0.
When we added the UCI checks to the AMBA matching path, this was for
newer components that we knew were following the CoreSight 3.0 concept
of having the same part number for components in the same common
function - i.e. ETM, CTI, PMU, with a given CPU.

There was no way of knowing if 100% of the older components that
matched the driver using only part number would also pass the UCI
check, so this was not added for them (and certainly early CTIs never
had a TRCDEVARCH). We did not want to break compatibility, given that
some of those components may not be ARM designed. I am not sure what
has changed such that we are now prepared to risk making the driver
incompatible with older devices.

> >
> > The act of adding in a check against TRCDEVARCH as part of the
> > etm4_init_arch_data() function adds new and hidden checks to AMBA
> > devices where it was sufficient to have an entry in the probe match
> > table before. Most recent additions include the UCI matching, but
> > older ones don't. I am concerned that this change may trip up older
> > existing implementations which for some reason may not have
> > TRCDEVARCH, or have set it to not present.
>
> For the record, ETMv4.0 revision A says the PRESENT bit is always
> Read As One (RAO). So, if they don't implement it or have set it to 0,
> that means that they are broken. But, we could gracefully handle it
> if the PRESENT bit is 0 and fall back to TRCIDR1.

>
> >
> > For this reason, I believe that the TRCDEVARCH check for the sys reg
> > access should occur on the sysreg specific probe - balancing what
> > happens on the AMBA side.  That way the common code remains common.
>
> To make the current situation clear, for those who have not looked
> at the series, here is the summary :
>
> 1) AMBA driver checks the PIDs to match a device to known ETMv4.
>     Note that CTIs could share the same PIDs and thus we added additional
>     check on the UCI (which is TRCDEVARCH) field for some of the ETMs.
>
> 2) The ETM4 driver assumes that the component is ETMv4 and calls
>     etm4_init_arch_data() and probes the ETM4 for features, filling
>     in drvdata, including the TRCIDR1. Please note that, at this point
>     there is no guarantee that the unit is indeed ETMv4, if the UCI
>     check (TRCDEVARCH) has not been performed. So, we are possibly
>     treading into wild land here (at least on bring up).
>
> 3) The etm4_probe() confirms that the ETM4 architecture is supported
>     by checking the TRCIDR1 fields (stored in drvdata->arch from step (2)).
>     This check is important (at least for bring up), because if the
>     UCI check is not added for the component, a CTI could be mistaken
>     for an ETM with AMBA devices.
>
> Fact : TRCDEVARCH must be implemented and represent that the component is
> ETMv4 (this is the basis of UCI check) since ETMv4.0 specification.
>

Agreed - but unfortunately in the real world not every component
correctly follows the spec - hence my concern about tightening up
checks on existing devices.

>
> For the system instructions based devices, we have :
> 1) Device tree compatible to advertise the presence of a trace unit on
>     a CPU. (no PID checks, this is good, because you don't have to
>     add an entry for a new CPU to be supported upstream, as long as
>     it is compliant with the ETMv4).
>
> 2) To confirm that the CPU tracing unit is ETMv4 compatible, we need
>     to use CoreSight architected register, TRCDEVARCH (the same as UCI).
>     This is because TRCIDR1 may not be what is expected if the Trace unit
>     is not ETMv4, since the encoding is ETM specific. And this must be
>     performed on the host CPU (just like etm4_init_arch_data).
>
> With this series:
>
>    * AMBA devices pass through the PID check as usual. But the sysreg
>      devices jump straight to etm4_init_arch_data() via common etm4_probe.
>
>    * etm4_init_arch_data() will verify that the component is ETMv4
>      by verifying the TRCDEVARCH for all ETMs (both AMBA and system
>      instructions), before poking the features. This will avoid having
>      to do another round of smp_func_call() in etm4 sysreg probe code.

OK - so using the existing function avoids having two smp_func_call()
instances - at the cost of increased complexity in the overall code.
If you think this is worth it then fair enough.


>       - Note: As per Mike's suggestion, this can be further relaxed to check
>         TRCIDR1 for AMBA devices, iff TRCDEVARCH is marked absent for
>         supporting any wild broken implementations out there (I prefer to add a
>         pr_warn_once() for such cases, so that we know such units).
>
>
> > Further the setup of the CSA for the device can happen immediately in
> > the common etm_probe() function, based on *base being NULL or not,
>
> For now, yes. But with the additional changes for supporting [0], this
> may not hold, as we really need to see whether we have an ETMv4.x or a
> future unit which has a slightly different register list.
>

Not sure I understand here - are you saying that there may be a case
where sysreg access could have a base memory address as well? After
ETM4.4 the two have to be mutually exclusive, and even where they are
not, there is no way the driver should be using both memory and system
register access. The case in [0] may indicate an extension to the
programming model, e.g. we currently check if the ETM version is >=
4.3 as this will affect the interpretation of resource selector
numbers, and future trace devices may well have more programming model
alterations - but this should not affect the access model.

Regards

Mike


> > rather than as a side effect of the etm4_init_arch_data() call.
>
> [0] https://developer.arm.com/docs/ddi0601/latest , See TRCIDR1
>
> Kind regards
> Suzuki
>
>
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK