diff mbox series

[RESEND] clocksource/arm_arch_timer: Fix masking for high freq counters

Message ID 20210806182126.2842876-1-oupton@google.com (mailing list archive)
State New, archived
Headers show
Series [RESEND] clocksource/arm_arch_timer: Fix masking for high freq counters | expand

Commit Message

Oliver Upton Aug. 6, 2021, 6:21 p.m. UTC
Unfortunately, the architecture provides no means to determine the bit
width of the system counter. However, we do know the following from the
specification:

 - the system counter is at least 56 bits wide
 - Roll-over time of not less than 40 years

To date, the arch timer driver has depended on the first property,
assuming any system counter to be 56 bits wide and masking off the rest.
However, combining a narrow clocksource mask with a high frequency
counter could result in prematurely wrapping the system counter by a
significant margin. For example, a 56 bit wide, 1GHz system counter
would wrap in a mere 2.28 years!

This is a problem for two reasons: v8.6+ implementations are required to
provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
implementers may select a counter frequency of their choosing.

Fix the issue by deriving a valid clock mask based on the second
property from above. Set the floor at 56 bits, since we know no system
counter is narrower than that.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
---
This patch was tested with QEMU, tweaked to provide a 1GHz system
counter frequency, as I could not easily figure out how to tweak the
base FVP to provide a 1GHz counter.

Parent commit: 0c32706dac1b ("arm64: stacktrace: avoid tracing arch_stack_walk()")

 drivers/clocksource/arm_arch_timer.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

Comments

Marc Zyngier Aug. 7, 2021, 10:52 a.m. UTC | #1
Hi Oliver,

On Fri, 06 Aug 2021 19:21:26 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Unfortunately, the architecture provides no means to determine the bit
> width of the system counter. However, we do know the following from the
> specification:
> 
>  - the system counter is at least 56 bits wide
>  - Roll-over time of not less than 40 years
> 
> To date, the arch timer driver has depended on the first property,
> assuming any system counter to be 56 bits wide and masking off the rest.
> However, combining a narrow clocksource mask with a high frequency
> counter could result in prematurely wrapping the system counter by a
> significant margin. For example, a 56 bit wide, 1GHz system counter
> would wrap in a mere 2.28 years!
> 
> This is a problem for two reasons: v8.6+ implementations are required to
> provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
> implementers may select a counter frequency of their choosing.
> 
> Fix the issue by deriving a valid clock mask based on the second
> property from above. Set the floor at 56 bits, since we know no system
> counter is narrower than that.
> 
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
> This patch was tested with QEMU, tweaked to provide a 1GHz system
> counter frequency, as I could not easily figure out how to tweak the
> base FVP to provide a 1GHz counter.

<FVP>
"bp.refcounter.base_frequency" is the property you are looking for. In
general, passing --list-params to the model reveals a treasure trove
of weird and wonderful options that can be used to configure the model
to your liking.
</FVP>

> 
> Parent commit: 0c32706dac1b ("arm64: stacktrace: avoid tracing arch_stack_walk()")
> 
>  drivers/clocksource/arm_arch_timer.c | 28 ++++++++++++++++++++++++++--
>  1 file changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index be6d741d404c..8c41626a4c8a 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -52,6 +52,12 @@
>  #define CNTV_TVAL	0x38
>  #define CNTV_CTL	0x3c
>  
> +/*
> + * The minimum amount of time a generic timer is guaranteed to not roll over

nit: s/timer/counter/

> + * (40 years)

For later reference, could you add the section of the ARMv8 ARM where
this is mentioned? Something like 'ARM DDI 0487G.a D11.1.2 ("The
system counter")', either here or in the comment further down.

> + */
> +#define MIN_ROLLOVER_SECS	(40ULL * 365 * 24 * 3600)
> +
>  static unsigned arch_timers_present __initdata;
>  
>  static void __iomem *arch_counter_base __ro_after_init;
> @@ -1004,9 +1010,24 @@ struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)
>  	return &arch_timer_kvm_info;
>  }
>  
> +/*
> + * Makes an educated guess at a valid counter width based on the Generic Timer
> + * specification. Of note:
> + *   1) the Generic Timer is at least 56 bits wide
> + *   2) a roll-over time of not less than 40 years
> + */
> +static int __init arch_counter_get_width(void)
> +{
> +	u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_get_cntfrq();

That's unfortunately wishful thinking. We have stupidly broken systems
out there that do not set CNTFRQ_EL0, and instead rely on DT
properties to describe the counter frequency. You're likely to end up
with a glorious zero as a result, with interesting consequences...

Use arch_timer_rate instead, which will be set as by the time you
register the counter.

> +
> +	/* guarantee the returned width is within the valid range */
> +	return max(56, min(64, ilog2(min_cycles)));

Maybe better written as "clamp_val(ilog2(min_cycles), 56, 64);".

> +}
> +
>  static void __init arch_counter_register(unsigned type)
>  {
>  	u64 start_count;
> +	int width;
>  
>  	/* Register the CP15 based counter if we have one */
>  	if (type & ARCH_TIMER_TYPE_CP15) {
> @@ -1031,6 +1052,10 @@ static void __init arch_counter_register(unsigned type)
>  		arch_timer_read_counter = arch_counter_get_cntvct_mem;
>  	}
>  
> +	width = arch_counter_get_width();
> +	clocksource_counter.mask = CLOCKSOURCE_MASK(width);
> +	cyclecounter.mask = CLOCKSOURCE_MASK(width);

Since you move this to be computed at runtime, how about dropping the
static initialisation of the mask fields?

> +
>  	if (!arch_counter_suspend_stop)
>  		clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
>  	start_count = arch_timer_read_counter();
> @@ -1040,8 +1065,7 @@ static void __init arch_counter_register(unsigned type)
>  	timecounter_init(&arch_timer_kvm_info.timecounter,
>  			 &cyclecounter, start_count);
>  
> -	/* 56 bits minimum, so we assume worst case rollover */
> -	sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
> +	sched_clock_register(arch_timer_read_counter, width, arch_timer_rate);
>  }
>  
>  static void arch_timer_stop(struct clock_event_device *clk)

Another thing that needs addressing for high frequency counter support
is to move away from TVAL programming and switch to CVAL, as the
maximum deadline we can currently program is 4.2s at 1GHz.

Fun fact: it has the interesting consequence of breaking XGene-1,
which implemented CVAL in terms of TVAL instead of the other way
around (what were these guys thinking?), though I don't think anyone
will notice in practice. I have a preliminary patch on a branch
somewhere that I'll try to dust up and post in the coming days.

Thanks,

	M.
Oliver Upton Aug. 7, 2021, 6:35 p.m. UTC | #2
Hi Marc,

On Sat, Aug 7, 2021 at 3:52 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Oliver,
>
> On Fri, 06 Aug 2021 19:21:26 +0100,
> Oliver Upton <oupton@google.com> wrote:
> >
> > Unfortunately, the architecture provides no means to determine the bit
> > width of the system counter. However, we do know the following from the
> > specification:
> >
> >  - the system counter is at least 56 bits wide
> >  - Roll-over time of not less than 40 years
> >
> > To date, the arch timer driver has depended on the first property,
> > assuming any system counter to be 56 bits wide and masking off the rest.
> > However, combining a narrow clocksource mask with a high frequency
> > counter could result in prematurely wrapping the system counter by a
> > significant margin. For example, a 56 bit wide, 1GHz system counter
> > would wrap in a mere 2.28 years!
> >
> > This is a problem for two reasons: v8.6+ implementations are required to
> > provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
> > implementers may select a counter frequency of their choosing.
> >
> > Fix the issue by deriving a valid clock mask based on the second
> > property from above. Set the floor at 56 bits, since we know no system
> > counter is narrower than that.
> >
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---
> > This patch was tested with QEMU, tweaked to provide a 1GHz system
> > counter frequency, as I could not easily figure out how to tweak the
> > base FVP to provide a 1GHz counter.
>
> <FVP>
> "bp.refcounter.base_frequency" is the property you are looking for. In
> general, passing --list-params to the model reveals a treasure trove
> of weird and wonderful options that can be used to configure the model
> to your liking.
> </FVP>

Ah! Thanks for the tip. I didn't find the fast models documentation to
be too terribly helpful.
> >
> > +/*
> > + * The minimum amount of time a generic timer is guaranteed to not roll over
>
> nit: s/timer/counter/
>

Ack.

> > + * (40 years)
>
> For later reference, could you add the section of the ARMv8 ARM where
> this is mentioned? Something like 'ARM DDI 0487G.a D11.1.2 ("The
> system counter")', either here or in the comment further down.
>

Sure thing. I think I'll drop it on the function below since we depend
on two properties of the system counter there.

> > +static int __init arch_counter_get_width(void)
> > +{
> > +     u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_get_cntfrq();
>
> That's unfortunately wishful thinking. We have stupidly broken systems
> out there that do not set CNTFRQ_EL0, and instead rely on DT
> properties to describe the counter frequency. You're likely to end up
> with a glorious zero as a result, with interesting consequences...

Blech. I didn't read the DT hooks for the driver.  That's awful.

> Use arch_timer_rate instead, which will be set as by the time you
> register the counter.

Ack.

>
> > +
> > +     /* guarantee the returned width is within the valid range */
> > +     return max(56, min(64, ilog2(min_cycles)));
>
> Maybe better written as "clamp_val(ilog2(min_cycles), 56, 64);".
>

Agreed.

> > +     width = arch_counter_get_width();
> > +     clocksource_counter.mask = CLOCKSOURCE_MASK(width);
> > +     cyclecounter.mask = CLOCKSOURCE_MASK(width);
>
> Since you move this to be computed at runtime, how about dropping the
> static initialisation of the mask fields?
>

Yep, already made the change locally after mailing out :)

> > +
> >       if (!arch_counter_suspend_stop)
> >               clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> >       start_count = arch_timer_read_counter();
> > @@ -1040,8 +1065,7 @@ static void __init arch_counter_register(unsigned type)
> >       timecounter_init(&arch_timer_kvm_info.timecounter,
> >                        &cyclecounter, start_count);
> >
> > -     /* 56 bits minimum, so we assume worst case rollover */
> > -     sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
> > +     sched_clock_register(arch_timer_read_counter, width, arch_timer_rate);
> >  }
> >
> >  static void arch_timer_stop(struct clock_event_device *clk)
>
> Another thing that needs addressing for high frequency counter support
> is to move away from TVAL programming and switch to CVAL, as the
> maximum deadline we can currently program is 4.2s at 1GHz.
>
> Fun fact: it has the interesting consequence of breaking XGene-1,
> which implemented CVAL in terms of TVAL instead of the other way
> around (what were these guys thinking?)

Yikes!

> , though I don't think anyone
> will notice in practice. I have a preliminary patch on a branch
> somewhere that I'll try to dust up and post in the coming days.

Sounds good, I'll clean this one up and send it out too.

--
Thanks,
Oliver
diff mbox series

Patch

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index be6d741d404c..8c41626a4c8a 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -52,6 +52,12 @@ 
 #define CNTV_TVAL	0x38
 #define CNTV_CTL	0x3c
 
+/*
+ * The minimum amount of time a generic timer is guaranteed to not roll over
+ * (40 years)
+ */
+#define MIN_ROLLOVER_SECS	(40ULL * 365 * 24 * 3600)
+
 static unsigned arch_timers_present __initdata;
 
 static void __iomem *arch_counter_base __ro_after_init;
@@ -1004,9 +1010,24 @@  struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)
 	return &arch_timer_kvm_info;
 }
 
+/*
+ * Makes an educated guess at a valid counter width based on the Generic Timer
+ * specification. Of note:
+ *   1) the Generic Timer is at least 56 bits wide
+ *   2) a roll-over time of not less than 40 years
+ */
+static int __init arch_counter_get_width(void)
+{
+	u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_get_cntfrq();
+
+	/* guarantee the returned width is within the valid range */
+	return max(56, min(64, ilog2(min_cycles)));
+}
+
 static void __init arch_counter_register(unsigned type)
 {
 	u64 start_count;
+	int width;
 
 	/* Register the CP15 based counter if we have one */
 	if (type & ARCH_TIMER_TYPE_CP15) {
@@ -1031,6 +1052,10 @@  static void __init arch_counter_register(unsigned type)
 		arch_timer_read_counter = arch_counter_get_cntvct_mem;
 	}
 
+	width = arch_counter_get_width();
+	clocksource_counter.mask = CLOCKSOURCE_MASK(width);
+	cyclecounter.mask = CLOCKSOURCE_MASK(width);
+
 	if (!arch_counter_suspend_stop)
 		clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
 	start_count = arch_timer_read_counter();
@@ -1040,8 +1065,7 @@  static void __init arch_counter_register(unsigned type)
 	timecounter_init(&arch_timer_kvm_info.timecounter,
 			 &cyclecounter, start_count);
 
-	/* 56 bits minimum, so we assume worst case rollover */
-	sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
+	sched_clock_register(arch_timer_read_counter, width, arch_timer_rate);
 }
 
 static void arch_timer_stop(struct clock_event_device *clk)