diff mbox

Weird sched_clock behaviour during boot with -rc1

Message ID 20140204220045.GC20528@codeaurora.org (mailing list archive)
State New, archived
Headers show

Commit Message

Stephen Boyd Feb. 4, 2014, 10 p.m. UTC
On 02/04, John Stultz wrote:
> On 02/04/2014 10:36 AM, Will Deacon wrote:
> > Hi guys,
> >
> > Booting -rc1 on my TC2 gives the following strange entries in the dmesg:
> >
> >
> >   Uncompressing Linux... done, booting the kernel.
> >   [    0.000000] Booting Linux on physical CPU 0x0
> >
> >   [...]
> >
> >   [    0.000000]   HighMem zone: 329728 pages, LIFO batch:31
> >   [    7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns
> >   [    0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864
> >
> >   [...]
> >
> >   [    0.868297] NR_IRQS:16 nr_irqs:16 16
> >   [    0.886350] Architected cp15 timer(s) running at 24.00MHz (phys).
> >   [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns
> >   [    0.000002] Switching to timer-based delay loop
> >   [    0.014249] Console: colour dummy device 80x30
> >
> >
> > so it looks like something whacky goes on during sched_clock registration.
> > Sure enough, we're doing a pr_info in-between updating cs.* and calling
> > update_sched_clock(), so moving the print sorts things out (diff below).
> 
> Yea... we have to be particularly careful with sched_clock to avoid
> locks since we don't want to deadlock, but in this case
> sched_clock_register is a little too relaxed here.
> 
> Stephen: Would it make sense to set cd.suspended = true at the top of
> the registration? That should block any sched_clock calls from getting
> half-updated data, but still allow the sched_clock_update function to work.
> 

That would work, but why can't we just hold the write seqlock
during the registration? We would need to make a lockeless
version of update_sched_clock() but that doesn't look too hard.
It might actually turn out nicer because we call
update_sched_clock() here just to set the epoch_cyc but we have
to reset the epoch_ns back to 0 to start the count off right.

How about this? The only concern is calling read_sched_clock()
inside the seqlock, but I don't think that's a concern and if it
is we can call it outside the lock at the beginning of this
function.

----8<----

Comments

John Stultz Feb. 7, 2014, 6:23 p.m. UTC | #1
On 02/04/2014 02:00 PM, Stephen Boyd wrote:
> On 02/04, John Stultz wrote:
>> On 02/04/2014 10:36 AM, Will Deacon wrote:
>>> Hi guys,
>>>
>>> Booting -rc1 on my TC2 gives the following strange entries in the dmesg:
>>>
>>>
>>>   Uncompressing Linux... done, booting the kernel.
>>>   [    0.000000] Booting Linux on physical CPU 0x0
>>>
>>>   [...]
>>>
>>>   [    0.000000]   HighMem zone: 329728 pages, LIFO batch:31
>>>   [    7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns
>>>   [    0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864
>>>
>>>   [...]
>>>
>>>   [    0.868297] NR_IRQS:16 nr_irqs:16 16
>>>   [    0.886350] Architected cp15 timer(s) running at 24.00MHz (phys).
>>>   [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns
>>>   [    0.000002] Switching to timer-based delay loop
>>>   [    0.014249] Console: colour dummy device 80x30
>>>
>>>
>>> so it looks like something whacky goes on during sched_clock registration.
>>> Sure enough, we're doing a pr_info in-between updating cs.* and calling
>>> update_sched_clock(), so moving the print sorts things out (diff below).
>> Yea... we have to be particularly careful with sched_clock to avoid
>> locks since we don't want to deadlock, but in this case
>> sched_clock_register is a little too relaxed here.
>>
>> Stephen: Would it make sense to set cd.suspended = true at the top of
>> the registration? That should block any sched_clock calls from getting
>> half-updated data, but still allow the sched_clock_update function to work.
>>
> That would work, but why can't we just hold the write seqlock
> during the registration? We would need to make a lockeless
> version of update_sched_clock() but that doesn't look too hard.
> It might actually turn out nicer because we call
> update_sched_clock() here just to set the epoch_cyc but we have
> to reset the epoch_ns back to 0 to start the count off right.
>
> How about this? The only concern is calling read_sched_clock()
> inside the seqlock, but I don't think that's a concern and if it
> is we can call it outside the lock at the beginning of this
> function.

Hey Stephen,

So whats the story here? Are we waiting on an ack from Will or would you
rather go with Josh's approach?

thanks
-john
Stephen Boyd Feb. 7, 2014, 7:37 p.m. UTC | #2
On 02/07, John Stultz wrote:
> On 02/04/2014 02:00 PM, Stephen Boyd wrote:
> >>
> > That would work, but why can't we just hold the write seqlock
> > during the registration? We would need to make a lockeless
> > version of update_sched_clock() but that doesn't look too hard.
> > It might actually turn out nicer because we call
> > update_sched_clock() here just to set the epoch_cyc but we have
> > to reset the epoch_ns back to 0 to start the count off right.
> >
> > How about this? The only concern is calling read_sched_clock()
> > inside the seqlock, but I don't think that's a concern and if it
> > is we can call it outside the lock at the beginning of this
> > function.
> 
> So whats the story here? Are we waiting on an ack from Will or would you
> rather go with Josh's approach?

An acked-by/tested-by from Will would be good. I'll cook up a
patch right now to do everything that has been mentioned in this
thread.
diff mbox

Patch

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0abb36464281..ae8f10a02c32 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -124,6 +124,7 @@  void __init sched_clock_register(u64 (*read)(void), int bits,
 		return;
 
 	WARN_ON(!irqs_disabled());
+	raw_write_seqcount_begin(&cd.seq);
 	read_sched_clock = read;
 	sched_clock_mask = CLOCKSOURCE_MASK(bits);
 	cd.rate = rate;
@@ -147,15 +148,16 @@  void __init sched_clock_register(u64 (*read)(void), int bits,
 
 	/* calculate the ns resolution of this counter */
 	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
-	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
-		bits, r, r_unit, res, wrap);
-
-	update_sched_clock();
 
 	/*
 	 * Ensure that sched_clock() starts off at 0ns
 	 */
 	cd.epoch_ns = 0;
+	cd.epoch_cyc = read_sched_clock();
+	raw_write_seqcount_end(&cd.seq);
+
+	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
+		bits, r, r_unit, res, wrap);
 
 	/* Enable IRQ time accounting if we have a fast enough sched_clock */