diff mbox

[RFC,v2,02/19] arm64: Use the physical counter when available for read_cycles

Message ID 20170717142718.13853-3-cdall@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Christoffer Dall July 17, 2017, 2:27 p.m. UTC
Currently get_cycles() is hardwired to arch_counter_get_cntvct() on
arm64, but as we move to using the physical timer for the in-kernel
time-keeping, we need to make that more flexible.

First, we need to make sure the physical counter can be read on equal
terms to the virtual counter, which includes adding physical counter
read functions for timers that require errata.

Second, we need to make a choice between reading the physical vs virtual
counter, depending on which timer is used for time keeping in the kernel
otherwise.  We can do this using a static key to avoid a performance
penalty during runtime when reading the counter.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
---
 arch/arm64/include/asm/arch_timer.h  | 18 ++++++++++++------
 arch/arm64/include/asm/timex.h       |  2 +-
 drivers/clocksource/arm_arch_timer.c | 32 ++++++++++++++++++++++++++++++--
 3 files changed, 43 insertions(+), 9 deletions(-)

Comments

Will Deacon July 25, 2017, 9:43 a.m. UTC | #1
Hi Christoffer,

On Mon, Jul 17, 2017 at 04:27:01PM +0200, Christoffer Dall wrote:
> Currently get_cycles() is hardwired to arch_counter_get_cntvct() on
> arm64, but as we move to using the physical timer for the in-kernel
> time-keeping, we need to make that more flexible.
> 
> First, we need to make sure the physical counter can be read on equal
> terms to the virtual counter, which includes adding physical counter
> read functions for timers that require errata.
> 
> Second, we need to make a choice between reading the physical vs virtual
> counter, depending on which timer is used for time keeping in the kernel
> otherwise.  We can do this using a static key to avoid a performance
> penalty during runtime when reading the counter.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <cdall@linaro.org>
> ---
>  arch/arm64/include/asm/arch_timer.h  | 18 ++++++++++++------
>  arch/arm64/include/asm/timex.h       |  2 +-
>  drivers/clocksource/arm_arch_timer.c | 32 ++++++++++++++++++++++++++++++--
>  3 files changed, 43 insertions(+), 9 deletions(-)

[...]

> @@ -886,10 +912,12 @@ static void __init arch_counter_register(unsigned type)
>  
>  	/* Register the CP15 based counter if we have one */
>  	if (type & ARCH_TIMER_TYPE_CP15) {
> -		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
> +		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
>  			arch_timer_read_counter = arch_counter_get_cntvct;
> -		else
> +		} else {
>  			arch_timer_read_counter = arch_counter_get_cntpct;
> +			static_branch_enable(&arch_timer_phys_counter_available);
> +		}

I'm a bit worried about this change, although I can't put my finger on
exactly the problematic scenario. My concern is that if we have a system
where the host kernel is entered at NS-EL1 (because, e.g. EL2 is used for
something else or the bootloader just didn't load us there) then the booting
protocol doesn't mandate a zero-initialised CNTVOFF value. If we can
subsequently end up using the physical counter in the kernel and the virtual
counter in userspace, the vDSO will get confused because the datapage values
will not correspond to the values it actually ends up reading. There's also
the likelihood that existing EL2 init code simply isn't setting up
CNTHCTL_EL2 and CNTVOFF correctly, so we probably need a way to force
virtual counter on the cmdline.

In practice it looks like we always end up with ARCH_TIMER_VIRT_PPI out of
arch_timer_select_ppi, but that's not guaranteed and I haven't thought at
all about the 32-bit case, which has other quirks/complexities.

Will
Christoffer Dall July 25, 2017, 2:36 p.m. UTC | #2
On Tue, Jul 25, 2017 at 10:43:08AM +0100, Will Deacon wrote:
> Hi Christoffer,
> 
> On Mon, Jul 17, 2017 at 04:27:01PM +0200, Christoffer Dall wrote:
> > Currently get_cycles() is hardwired to arch_counter_get_cntvct() on
> > arm64, but as we move to using the physical timer for the in-kernel
> > time-keeping, we need to make that more flexible.
> > 
> > First, we need to make sure the physical counter can be read on equal
> > terms to the virtual counter, which includes adding physical counter
> > read functions for timers that require errata.
> > 
> > Second, we need to make a choice between reading the physical vs virtual
> > counter, depending on which timer is used for time keeping in the kernel
> > otherwise.  We can do this using a static key to avoid a performance
> > penalty during runtime when reading the counter.
> > 
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Marc Zyngier <marc.zyngier@arm.com>
> > Signed-off-by: Christoffer Dall <cdall@linaro.org>
> > ---
> >  arch/arm64/include/asm/arch_timer.h  | 18 ++++++++++++------
> >  arch/arm64/include/asm/timex.h       |  2 +-
> >  drivers/clocksource/arm_arch_timer.c | 32 ++++++++++++++++++++++++++++++--
> >  3 files changed, 43 insertions(+), 9 deletions(-)
> 
> [...]
> 
> > @@ -886,10 +912,12 @@ static void __init arch_counter_register(unsigned type)
> >  
> >  	/* Register the CP15 based counter if we have one */
> >  	if (type & ARCH_TIMER_TYPE_CP15) {
> > -		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
> > +		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
> >  			arch_timer_read_counter = arch_counter_get_cntvct;
> > -		else
> > +		} else {
> >  			arch_timer_read_counter = arch_counter_get_cntpct;
> > +			static_branch_enable(&arch_timer_phys_counter_available);
> > +		}
> 
> I'm a bit worried about this change, although I can't put my finger on
> exactly the problematic scenario. My concern is that if we have a system
> where the host kernel is entered at NS-EL1 (because, e.g. EL2 is used for
> something else or the bootloader just didn't load us there) then the booting
> protocol doesn't mandate a zero-initialised CNTVOFF value. If we can
> subsequently end up using the physical counter in the kernel and the virtual
> counter in userspace, the vDSO will get confused because the datapage values
> will not correspond to the values it actually ends up reading. 

Just so I'm sure I'm understanding correctly, vDSO always reads the
virtual counter?

> There's also
> the likelihood that existing EL2 init code simply isn't setting up
> CNTHCTL_EL2 and CNTVOFF correctly, so we probably need a way to force
> virtual counter on the cmdline.

My understanding was that we only ever use the physical counter when we
boot at EL2 and therefore the kernel is in control of CNTVOFF and can
set that to 0.  Is this not the case, or are you asking for a way to
mandate this relationship or make it abundantly clear?

Also, are you fine with arch_timer_read_counter changing to using the
physical counter on arm64, but you're merely worried about
read_cycles()?

Does it make sense to have read_cycles() still read the virtual counter
whereas arch timers (timers, not counters) use the physical counter?  I
somewhat convinced myself that this doesn't work though.

> 
> In practice it looks like we always end up with ARCH_TIMER_VIRT_PPI out of
> arch_timer_select_ppi, but that's not guaranteed and I haven't thought at
> all about the 32-bit case, which has other quirks/complexities.
> 

How does this change affect the 32-bit side?  All this should do is
enable a static branch which is unused on the 32-bit side; what am I
missing?

Thanks,
-Christoffer
Will Deacon July 26, 2017, 5:17 p.m. UTC | #3
On Tue, Jul 25, 2017 at 07:36:47AM -0700, Christoffer Dall wrote:
> On Tue, Jul 25, 2017 at 10:43:08AM +0100, Will Deacon wrote:
> > On Mon, Jul 17, 2017 at 04:27:01PM +0200, Christoffer Dall wrote:
> > > Currently get_cycles() is hardwired to arch_counter_get_cntvct() on
> > > arm64, but as we move to using the physical timer for the in-kernel
> > > time-keeping, we need to make that more flexible.
> > > 
> > > First, we need to make sure the physical counter can be read on equal
> > > terms to the virtual counter, which includes adding physical counter
> > > read functions for timers that require errata.
> > > 
> > > Second, we need to make a choice between reading the physical vs virtual
> > > counter, depending on which timer is used for time keeping in the kernel
> > > otherwise.  We can do this using a static key to avoid a performance
> > > penalty during runtime when reading the counter.
> > > 
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Marc Zyngier <marc.zyngier@arm.com>
> > > Signed-off-by: Christoffer Dall <cdall@linaro.org>
> > > ---
> > >  arch/arm64/include/asm/arch_timer.h  | 18 ++++++++++++------
> > >  arch/arm64/include/asm/timex.h       |  2 +-
> > >  drivers/clocksource/arm_arch_timer.c | 32 ++++++++++++++++++++++++++++++--
> > >  3 files changed, 43 insertions(+), 9 deletions(-)
> > 
> > [...]
> > 
> > > @@ -886,10 +912,12 @@ static void __init arch_counter_register(unsigned type)
> > >  
> > >  	/* Register the CP15 based counter if we have one */
> > >  	if (type & ARCH_TIMER_TYPE_CP15) {
> > > -		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
> > > +		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
> > >  			arch_timer_read_counter = arch_counter_get_cntvct;
> > > -		else
> > > +		} else {
> > >  			arch_timer_read_counter = arch_counter_get_cntpct;
> > > +			static_branch_enable(&arch_timer_phys_counter_available);
> > > +		}
> > 
> > I'm a bit worried about this change, although I can't put my finger on
> > exactly the problematic scenario. My concern is that if we have a system
> > where the host kernel is entered at NS-EL1 (because, e.g. EL2 is used for
> > something else or the bootloader just didn't load us there) then the booting
> > protocol doesn't mandate a zero-initialised CNTVOFF value. If we can
> > subsequently end up using the physical counter in the kernel and the virtual
> > counter in userspace, the vDSO will get confused because the datapage values
> > will not correspond to the values it actually ends up reading. 
> 
> Just so I'm sure I'm understanding correctly, vDSO always reads the
> virtual counter?

Yes, that's right.

> > There's also
> > the likelihood that existing EL2 init code simply isn't setting up
> > CNTHCTL_EL2 and CNTVOFF correctly, so we probably need a way to force
> > virtual counter on the cmdline.
> 
> My understanding was that we only ever use the physical counter when we
> boot at EL2 and therefore the kernel is in control of CNTVOFF and can
> set that to 0.  Is this not the case, or are you asking for a way to
> mandate this relationship or make it abundantly clear?

AFAICT, arch_timer_ppi_nr could return ARCH_TIMER_PHYS_NONSECURE_PPI
or ARCH_TIMER_PHYS_SECURE_PPI when we're not running at EL2, which would
cause us to use the physical counter with your patch applied.

> Also, are you fine with arch_timer_read_counter changing to using the
> physical counter on arm64, but you're merely worried about
> read_cycles()?

Assuming you mean get_cycles(), then no, I'm not worried about that because
it's just used for things like small delta calculations and entropy. I'm
worried about the timekeeper (which I think uses arch_timer_read_counter)
being based on the physical counter and the vDSO being based on the virtual
counter and CNTVOFF != 0.

> Does it make sense to have read_cycles() still read the virtual counter
> whereas arch timers (timers, not counters) use the physical counter?  I
> somewhat convinced myself that this doesn't work though.

I don't think that solves the problem :(

> > In practice it looks like we always end up with ARCH_TIMER_VIRT_PPI out of
> > arch_timer_select_ppi, but that's not guaranteed and I haven't thought at
> > all about the 32-bit case, which has other quirks/complexities.
> > 
> 
> How does this change affect the 32-bit side?  All this should do is
> enable a static branch which is unused on the 32-bit side; what am I
> missing?

The PPI selection is more complicated for 32-bit, because of the
"arm,cpu-registers-not-fw-configured" quirk.

Will
diff mbox

Patch

diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
index ee5619b..48d522c 100644
--- a/arch/arm64/include/asm/arch_timer.h
+++ b/arch/arm64/include/asm/arch_timer.h
@@ -30,6 +30,8 @@ 
 
 #include <clocksource/arm_arch_timer.h>
 
+extern struct static_key_false arch_timer_phys_counter_available;
+
 #if IS_ENABLED(CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND)
 extern struct static_key_false arch_timer_read_ool_enabled;
 #define needs_unstable_timer_counter_workaround() \
@@ -52,6 +54,7 @@  struct arch_timer_erratum_workaround {
 	const char *desc;
 	u32 (*read_cntp_tval_el0)(void);
 	u32 (*read_cntv_tval_el0)(void);
+	u64 (*read_cntpct_el0)(void);
 	u64 (*read_cntvct_el0)(void);
 	int (*set_next_event_phys)(unsigned long, struct clock_event_device *);
 	int (*set_next_event_virt)(unsigned long, struct clock_event_device *);
@@ -148,13 +151,8 @@  static inline void arch_timer_set_cntkctl(u32 cntkctl)
 
 static inline u64 arch_counter_get_cntpct(void)
 {
-	u64 cval;
-	/*
-	 * AArch64 kernel and user space mandate the use of CNTVCT.
-	 */
 	isb();
-	asm volatile("mrs %0, cntpct_el0" : "=r" (cval));
-	return cval;
+	return arch_timer_reg_read_stable(cntpct_el0);
 }
 
 static inline u64 arch_counter_get_cntvct(void)
@@ -163,6 +161,14 @@  static inline u64 arch_counter_get_cntvct(void)
 	return arch_timer_reg_read_stable(cntvct_el0);
 }
 
+static inline u64 arch_counter_get_cycles(void)
+{
+	if (static_branch_unlikely(&arch_timer_phys_counter_available))
+	    return arch_counter_get_cntpct();
+	else
+	    return arch_counter_get_cntvct();
+}
+
 static inline int arch_timer_arch_init(void)
 {
 	return 0;
diff --git a/arch/arm64/include/asm/timex.h b/arch/arm64/include/asm/timex.h
index 81a076e..c0d214c 100644
--- a/arch/arm64/include/asm/timex.h
+++ b/arch/arm64/include/asm/timex.h
@@ -22,7 +22,7 @@ 
  * Use the current timer as a cycle counter since this is what we use for
  * the delay loop.
  */
-#define get_cycles()	arch_counter_get_cntvct()
+#define get_cycles()	arch_counter_get_cycles()
 
 #include <asm-generic/timex.h>
 
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index c24327c..f4e7261 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -77,6 +77,9 @@  static bool arch_timer_mem_use_virtual;
 static bool arch_counter_suspend_stop;
 static bool vdso_default = true;
 
+DEFINE_STATIC_KEY_FALSE(arch_timer_phys_counter_available);
+EXPORT_SYMBOL_GPL(arch_timer_phys_counter_available);
+
 static bool evtstrm_enable = IS_ENABLED(CONFIG_ARM_ARCH_TIMER_EVTSTREAM);
 
 static int __init early_evtstrm_cfg(char *buf)
@@ -217,6 +220,11 @@  static u32 notrace fsl_a008585_read_cntv_tval_el0(void)
 	return __fsl_a008585_read_reg(cntv_tval_el0);
 }
 
+static u64 notrace fsl_a008585_read_cntpct_el0(void)
+{
+	return __fsl_a008585_read_reg(cntpct_el0);
+}
+
 static u64 notrace fsl_a008585_read_cntvct_el0(void)
 {
 	return __fsl_a008585_read_reg(cntvct_el0);
@@ -258,6 +266,11 @@  static u32 notrace hisi_161010101_read_cntv_tval_el0(void)
 	return __hisi_161010101_read_reg(cntv_tval_el0);
 }
 
+static u64 notrace hisi_161010101_read_cntpct_el0(void)
+{
+	return __hisi_161010101_read_reg(cntpct_el0);
+}
+
 static u64 notrace hisi_161010101_read_cntvct_el0(void)
 {
 	return __hisi_161010101_read_reg(cntvct_el0);
@@ -288,6 +301,15 @@  static struct ate_acpi_oem_info hisi_161010101_oem_info[] = {
 #endif
 
 #ifdef CONFIG_ARM64_ERRATUM_858921
+static u64 notrace arm64_858921_read_cntpct_el0(void)
+{
+	u64 old, new;
+
+	old = read_sysreg(cntpct_el0);
+	new = read_sysreg(cntpct_el0);
+	return (((old ^ new) >> 32) & 1) ? old : new;
+}
+
 static u64 notrace arm64_858921_read_cntvct_el0(void)
 {
 	u64 old, new;
@@ -346,6 +368,7 @@  static const struct arch_timer_erratum_workaround ool_workarounds[] = {
 		.desc = "Freescale erratum a005858",
 		.read_cntp_tval_el0 = fsl_a008585_read_cntp_tval_el0,
 		.read_cntv_tval_el0 = fsl_a008585_read_cntv_tval_el0,
+		.read_cntpct_el0 = fsl_a008585_read_cntpct_el0,
 		.read_cntvct_el0 = fsl_a008585_read_cntvct_el0,
 		.set_next_event_phys = erratum_set_next_event_tval_phys,
 		.set_next_event_virt = erratum_set_next_event_tval_virt,
@@ -358,6 +381,7 @@  static const struct arch_timer_erratum_workaround ool_workarounds[] = {
 		.desc = "HiSilicon erratum 161010101",
 		.read_cntp_tval_el0 = hisi_161010101_read_cntp_tval_el0,
 		.read_cntv_tval_el0 = hisi_161010101_read_cntv_tval_el0,
+		.read_cntpct_el0 = hisi_161010101_read_cntpct_el0,
 		.read_cntvct_el0 = hisi_161010101_read_cntvct_el0,
 		.set_next_event_phys = erratum_set_next_event_tval_phys,
 		.set_next_event_virt = erratum_set_next_event_tval_virt,
@@ -368,6 +392,7 @@  static const struct arch_timer_erratum_workaround ool_workarounds[] = {
 		.desc = "HiSilicon erratum 161010101",
 		.read_cntp_tval_el0 = hisi_161010101_read_cntp_tval_el0,
 		.read_cntv_tval_el0 = hisi_161010101_read_cntv_tval_el0,
+		.read_cntpct_el0 = hisi_161010101_read_cntpct_el0,
 		.read_cntvct_el0 = hisi_161010101_read_cntvct_el0,
 		.set_next_event_phys = erratum_set_next_event_tval_phys,
 		.set_next_event_virt = erratum_set_next_event_tval_virt,
@@ -378,6 +403,7 @@  static const struct arch_timer_erratum_workaround ool_workarounds[] = {
 		.match_type = ate_match_local_cap_id,
 		.id = (void *)ARM64_WORKAROUND_858921,
 		.desc = "ARM erratum 858921",
+		.read_cntpct_el0 = arm64_858921_read_cntpct_el0,
 		.read_cntvct_el0 = arm64_858921_read_cntvct_el0,
 	},
 #endif
@@ -886,10 +912,12 @@  static void __init arch_counter_register(unsigned type)
 
 	/* Register the CP15 based counter if we have one */
 	if (type & ARCH_TIMER_TYPE_CP15) {
-		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
+		if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
 			arch_timer_read_counter = arch_counter_get_cntvct;
-		else
+		} else {
 			arch_timer_read_counter = arch_counter_get_cntpct;
+			static_branch_enable(&arch_timer_phys_counter_available);
+		}
 
 		clocksource_counter.archdata.vdso_direct = vdso_default;
 	} else {