Message ID | 467597048eda3004bd69f1fbe3981aab111e00dd.1455810755.git.jglauber@cavium.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > With the long cycle counter bit (LC) disabled the cycle counter is not > working on ThunderX SOC (ThunderX only implements Aarch64). > Also, according to documentation LC == 0 is deprecated. > > To keep the code simple the patch does not introduce 64 bit wide counter > functions. Instead writing the cycle counter always sets the upper > 32 bits so overflow interrupts are generated as before. > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch? Will
On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote: > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > > With the long cycle counter bit (LC) disabled the cycle counter is not > > working on ThunderX SOC (ThunderX only implements Aarch64). > > Also, according to documentation LC == 0 is deprecated. > > > > To keep the code simple the patch does not introduce 64 bit wide counter > > functions. Instead writing the cycle counter always sets the upper > > 32 bits so overflow interrupts are generated as before. > > > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch? > > Will I've modified Andrew's patch. I assumed his formal S-o-B is not required. Please correct me if I'm wrong. Jan
On 02/18/2016 09:34 AM, Will Deacon wrote: > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: >> With the long cycle counter bit (LC) disabled the cycle counter is not >> working on ThunderX SOC (ThunderX only implements Aarch64). >> Also, according to documentation LC == 0 is deprecated. >> >> To keep the code simple the patch does not introduce 64 bit wide counter >> functions. Instead writing the cycle counter always sets the upper >> 32 bits so overflow interrupts are generated as before. >> >> Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch? I don't believe we need Andrew's S-o-B as the assertion of the Developer's Certificate of Origin 1.1 clauses (a), (b) and (d) is being made. Specifically, clause (c) does not apply. However this may be a gray area, so we could put on Andrew's S-o-B if that would make everybody happier. David Daney > > Will > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote: > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > > With the long cycle counter bit (LC) disabled the cycle counter is not > > working on ThunderX SOC (ThunderX only implements Aarch64). > > Also, according to documentation LC == 0 is deprecated. > > > > To keep the code simple the patch does not introduce 64 bit wide counter > > functions. Instead writing the cycle counter always sets the upper > > 32 bits so overflow interrupts are generated as before. > > > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch? Hi Will, Please let me know if I should repost or not, FWIW I got Andrew's S-o-B on the patch. Thanks, Jan > Will
On Mon, Feb 22, 2016 at 01:45:14PM +0100, Jan Glauber wrote: > On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote: > > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > > > With the long cycle counter bit (LC) disabled the cycle counter is not > > > working on ThunderX SOC (ThunderX only implements Aarch64). > > > Also, according to documentation LC == 0 is deprecated. > > > > > > To keep the code simple the patch does not introduce 64 bit wide counter > > > functions. Instead writing the cycle counter always sets the upper > > > 32 bits so overflow interrupts are generated as before. > > > > > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > > > What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch? > > Please let me know if I should repost or not, FWIW I got Andrew's S-o-B on the > patch. I think it's fine. This should all be in -next as of last Friday anyhow. Will
Hi Jan, I've queued this lot on my perf/updates branch, but I just noticed an oddity whilst dealing with some potential conflicts with the kvm tree. On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > With the long cycle counter bit (LC) disabled the cycle counter is not > working on ThunderX SOC (ThunderX only implements Aarch64). > Also, according to documentation LC == 0 is deprecated. > > To keep the code simple the patch does not introduce 64 bit wide counter > functions. Instead writing the cycle counter always sets the upper > 32 bits so overflow interrupts are generated as before. > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > Signed-off-by: Jan Glauber <jglauber@cavium.com> > --- > arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++----- > 1 file changed, 16 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c > index 0ed05f6..c68fa98 100644 > --- a/arch/arm64/kernel/perf_event.c > +++ b/arch/arm64/kernel/perf_event.c > @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = { > #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */ > #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */ > #define ARMV8_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ > +#define ARMV8_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ > #define ARMV8_PMCR_N_SHIFT 11 /* Number of counters supported */ > #define ARMV8_PMCR_N_MASK 0x1f > #define ARMV8_PMCR_MASK 0x3f /* Mask for writable bits */ You haven't extended this mask to cover the LC bit, so it will be ignored by armv8pmu_pmcr_write afaict. How did you test this? I can easily update the mask, but it would be good to know that it doesn't end up cause a breakage. Will
On Mon, Feb 29, 2016 at 03:39:35PM +0000, Will Deacon wrote: > Hi Jan, > > I've queued this lot on my perf/updates branch, but I just noticed an > oddity whilst dealing with some potential conflicts with the kvm tree. > > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > > With the long cycle counter bit (LC) disabled the cycle counter is not > > working on ThunderX SOC (ThunderX only implements Aarch64). > > Also, according to documentation LC == 0 is deprecated. > > > > To keep the code simple the patch does not introduce 64 bit wide counter > > functions. Instead writing the cycle counter always sets the upper > > 32 bits so overflow interrupts are generated as before. > > > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > > > Signed-off-by: Jan Glauber <jglauber@cavium.com> > > --- > > arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++----- > > 1 file changed, 16 insertions(+), 5 deletions(-) > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c > > index 0ed05f6..c68fa98 100644 > > --- a/arch/arm64/kernel/perf_event.c > > +++ b/arch/arm64/kernel/perf_event.c > > @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = { > > #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */ > > #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */ > > #define ARMV8_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ > > +#define ARMV8_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ > > #define ARMV8_PMCR_N_SHIFT 11 /* Number of counters supported */ > > #define ARMV8_PMCR_N_MASK 0x1f > > #define ARMV8_PMCR_MASK 0x3f /* Mask for writable bits */ > > You haven't extended this mask to cover the LC bit, so it will be ignored > by armv8pmu_pmcr_write afaict. This is weird. I've double checked and I missed this mask. Annoying. Nevertheless it works for me without the LC bit set. > How did you test this? I can easily update the mask, but it would be > good to know that it doesn't end up cause a breakage. For testing I used: - perf top and perf record & report - looked at interrupt numbers in /proc/interrupts Without the patch _no_ samples at all are recorded and the interrupt does not occur. With the patch I get samples and see a reasonable number of interrupts. Extending the mask so the LC bit is covered would make sense, I'm going to test this now. Jan > Will
On Mon, Feb 29, 2016 at 03:39:35PM +0000, Will Deacon wrote: > Hi Jan, > > I've queued this lot on my perf/updates branch, but I just noticed an > oddity whilst dealing with some potential conflicts with the kvm tree. > > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote: > > With the long cycle counter bit (LC) disabled the cycle counter is not > > working on ThunderX SOC (ThunderX only implements Aarch64). > > Also, according to documentation LC == 0 is deprecated. > > > > To keep the code simple the patch does not introduce 64 bit wide counter > > functions. Instead writing the cycle counter always sets the upper > > 32 bits so overflow interrupts are generated as before. > > > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> > > > > Signed-off-by: Jan Glauber <jglauber@cavium.com> > > --- > > arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++----- > > 1 file changed, 16 insertions(+), 5 deletions(-) > > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c > > index 0ed05f6..c68fa98 100644 > > --- a/arch/arm64/kernel/perf_event.c > > +++ b/arch/arm64/kernel/perf_event.c > > @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = { > > #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */ > > #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */ > > #define ARMV8_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ > > +#define ARMV8_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ > > #define ARMV8_PMCR_N_SHIFT 11 /* Number of counters supported */ > > #define ARMV8_PMCR_N_MASK 0x1f > > #define ARMV8_PMCR_MASK 0x3f /* Mask for writable bits */ > > You haven't extended this mask to cover the LC bit, so it will be ignored > by armv8pmu_pmcr_write afaict. > > How did you test this? I can easily update the mask, but it would be > good to know that it doesn't end up cause a breakage. Please update the mask, I've tested with ARMV8_PMCR_MASK set to 0x7f and it works fine. It seems like it would work even without the LC bit set because we set the upper bits again after an interrupt, but reading the documentation we should really use ARMV8_PMCR_LC. thanks, Jan > Will
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 0ed05f6..c68fa98 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = { #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */ #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */ #define ARMV8_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/ +#define ARMV8_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */ #define ARMV8_PMCR_N_SHIFT 11 /* Number of counters supported */ #define ARMV8_PMCR_N_MASK 0x1f #define ARMV8_PMCR_MASK 0x3f /* Mask for writable bits */ @@ -494,9 +495,16 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value) if (!armv8pmu_counter_valid(cpu_pmu, idx)) pr_err("CPU%u writing wrong counter %d\n", smp_processor_id(), idx); - else if (idx == ARMV8_IDX_CYCLE_COUNTER) - asm volatile("msr pmccntr_el0, %0" :: "r" (value)); - else if (armv8pmu_select_counter(idx) == idx) + else if (idx == ARMV8_IDX_CYCLE_COUNTER) { + /* + * Set the upper 32bits as this is a 64bit counter but we only + * count using the lower 32bits and we want an interrupt when + * it overflows. + */ + u64 value64 = 0xffffffff00000000ULL | value; + + asm volatile("msr pmccntr_el0, %0" :: "r" (value64)); + } else if (armv8pmu_select_counter(idx) == idx) asm volatile("msr pmxevcntr_el0, %0" :: "r" (value)); } @@ -768,8 +776,11 @@ static void armv8pmu_reset(void *info) armv8pmu_disable_intens(idx); } - /* Initialize & Reset PMNC: C and P bits. */ - armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C); + /* + * Initialize & Reset PMNC. Request overflow interrupt for + * 64 bit cycle counter but cheat in armv8pmu_write_counter(). + */ + armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C | ARMV8_PMCR_LC); } static int armv8_pmuv3_map_event(struct perf_event *event)
With the long cycle counter bit (LC) disabled the cycle counter is not working on ThunderX SOC (ThunderX only implements Aarch64). Also, according to documentation LC == 0 is deprecated. To keep the code simple the patch does not introduce 64 bit wide counter functions. Instead writing the cycle counter always sets the upper 32 bits so overflow interrupts are generated as before. Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com> Signed-off-by: Jan Glauber <jglauber@cavium.com> --- arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)