diff mbox

sched: sched_clock() clocksource handling.

Message ID 20090602075409.GA19294@linux-sh.org (mailing list archive)
State Superseded
Headers show

Commit Message

Paul Mundt June 2, 2009, 7:54 a.m. UTC
On Tue, Jun 02, 2009 at 09:41:35AM +0200, Peter Zijlstra wrote:
> On Tue, 2009-06-02 at 16:35 +0900, Paul Mundt wrote:
> > 
> > We already do via select_clocksource(), if we are unregistering the
> > current one then a new one with the flag set is selected. Before that,
> > the override is likewise given preference, and we fall back on jiffies if
> > there is nothing else. I suppose we could try and find the "best" one,
> > but I think the override and manual clocksource selection should be fine
> > for this.
> 
> Ah, ok. So unregister calls select_clocksource again? That does leave us
> a small window with jiffies, but I guess that's ok.
> 
A synchronize_rcu() would fix that up, but I think a small window with
jiffies is less painful than sorting out RCU ordering and synchronization
for a corner case of a corner case ;-)

> > Now that you mention it though, the sched_clocksource() assignment within
> > select_clocksource() happens underneath the clocksource_lock, but is not
> > using rcu_assign_pointer().
> 
> Right, that would want fixing indeed.
> 
> >  If the assignment there needs to use
> > rcu_assign_pointer() then presumably all of the unlock paths that do
> > select_clocksource() will have to synchronize_rcu()?
> 
> No, you only have to do sync_rcu() when stuff that could have referenced
> is going away and you cannot use call_rcu().
> 
> So when selecting a new clocksource, you don't need synchonization
> because stuff doesn't go away (I think :-)

Ok, that keeps things more simplified then. How does this look?

---

 include/linux/clocksource.h |    4 +++-
 kernel/sched_clock.c        |   13 +++++++++++--
 kernel/time/clocksource.c   |   19 +++++++++++++++++++
 kernel/time/jiffies.c       |    2 +-
 4 files changed, 34 insertions(+), 4 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Peter Zijlstra June 2, 2009, 8 a.m. UTC | #1
On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> On Tue, Jun 02, 2009 at 09:41:35AM +0200, Peter Zijlstra wrote:
> > On Tue, 2009-06-02 at 16:35 +0900, Paul Mundt wrote:
> > > 
> > > We already do via select_clocksource(), if we are unregistering the
> > > current one then a new one with the flag set is selected. Before that,
> > > the override is likewise given preference, and we fall back on jiffies if
> > > there is nothing else. I suppose we could try and find the "best" one,
> > > but I think the override and manual clocksource selection should be fine
> > > for this.
> > 
> > Ah, ok. So unregister calls select_clocksource again? That does leave us
> > a small window with jiffies, but I guess that's ok.
> > 
> A synchronize_rcu() would fix that up, but I think a small window with
> jiffies is less painful than sorting out RCU ordering and synchronization
> for a corner case of a corner case ;-)
> 
> > > Now that you mention it though, the sched_clocksource() assignment within
> > > select_clocksource() happens underneath the clocksource_lock, but is not
> > > using rcu_assign_pointer().
> > 
> > Right, that would want fixing indeed.
> > 
> > >  If the assignment there needs to use
> > > rcu_assign_pointer() then presumably all of the unlock paths that do
> > > select_clocksource() will have to synchronize_rcu()?
> > 
> > No, you only have to do sync_rcu() when stuff that could have referenced
> > is going away and you cannot use call_rcu().
> > 
> > So when selecting a new clocksource, you don't need synchonization
> > because stuff doesn't go away (I think :-)
> 
> Ok, that keeps things more simplified then. How does this look?


Looks fine to me,

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Mundt June 2, 2009, 8 a.m. UTC | #2
On Tue, Jun 02, 2009 at 10:00:03AM +0200, Peter Zijlstra wrote:
> On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> > On Tue, Jun 02, 2009 at 09:41:35AM +0200, Peter Zijlstra wrote:
> > > On Tue, 2009-06-02 at 16:35 +0900, Paul Mundt wrote:
> > > > 
> > > > We already do via select_clocksource(), if we are unregistering the
> > > > current one then a new one with the flag set is selected. Before that,
> > > > the override is likewise given preference, and we fall back on jiffies if
> > > > there is nothing else. I suppose we could try and find the "best" one,
> > > > but I think the override and manual clocksource selection should be fine
> > > > for this.
> > > 
> > > Ah, ok. So unregister calls select_clocksource again? That does leave us
> > > a small window with jiffies, but I guess that's ok.
> > > 
> > A synchronize_rcu() would fix that up, but I think a small window with
> > jiffies is less painful than sorting out RCU ordering and synchronization
> > for a corner case of a corner case ;-)
> > 
> > > > Now that you mention it though, the sched_clocksource() assignment within
> > > > select_clocksource() happens underneath the clocksource_lock, but is not
> > > > using rcu_assign_pointer().
> > > 
> > > Right, that would want fixing indeed.
> > > 
> > > >  If the assignment there needs to use
> > > > rcu_assign_pointer() then presumably all of the unlock paths that do
> > > > select_clocksource() will have to synchronize_rcu()?
> > > 
> > > No, you only have to do sync_rcu() when stuff that could have referenced
> > > is going away and you cannot use call_rcu().
> > > 
> > > So when selecting a new clocksource, you don't need synchonization
> > > because stuff doesn't go away (I think :-)
> > 
> > Ok, that keeps things more simplified then. How does this look?
> 
> 
> Looks fine to me,
> 
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

Thanks for the help!
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Walker June 2, 2009, 11:49 a.m. UTC | #3
On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
>  unsigned long long __attribute__((weak)) sched_clock(void)
>  {
> -       return (unsigned long long)(jiffies - INITIAL_JIFFIES)
> -                                       * (NSEC_PER_SEC / HZ);
> +       unsigned long long time;
> +       struct clocksource *clock;
> +
> +       rcu_read_lock();
> +       clock = rcu_dereference(sched_clocksource);
> +       time = cyc2ns(clock, clocksource_read(clock));
> +       rcu_read_unlock();
> +
> +       return time;
>  }

My concerns with the locking here still stand. Nothing you've said or
done bolsters the clocksource in modules argument. I think what your
planning for sh clocksources seems very inelegant. I would imagine a
better solution is out there. I'd prefer if you just leave sched_clock
alone.

Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra June 2, 2009, 12:26 p.m. UTC | #4
On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> @@ -212,6 +213,7 @@ extern struct clocksource *clock;   /* current clocksource */
>  
>  #define CLOCK_SOURCE_WATCHDOG                  0x10
>  #define CLOCK_SOURCE_VALID_FOR_HRES            0x20
> +#define CLOCK_SOURCE_USE_FOR_SCHED_CLOCK       0x40

One more thing though, as John suggested, this bit might want a proper
comment explaining when a clocksource should and should not set this
bit.
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner June 2, 2009, 8:21 p.m. UTC | #5
On Tue, 2 Jun 2009, Daniel Walker wrote:
> On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> >  unsigned long long __attribute__((weak)) sched_clock(void)
> >  {
> > -       return (unsigned long long)(jiffies - INITIAL_JIFFIES)
> > -                                       * (NSEC_PER_SEC / HZ);
> > +       unsigned long long time;
> > +       struct clocksource *clock;
> > +
> > +       rcu_read_lock();
> > +       clock = rcu_dereference(sched_clocksource);
> > +       time = cyc2ns(clock, clocksource_read(clock));
> > +       rcu_read_unlock();
> > +
> > +       return time;
> >  }
> 
> My concerns with the locking here still stand. Nothing you've said or
> done bolsters the clocksource in modules argument. I think what your

Can you please stop to belabor modules? Again, it is totally
_irrelevant_ whether the clocksource is in a module or not.

unregister_clocksource() can be called from compiled in code as well
to replace a clocksource which is physically shut down. We have to
deal with that modules or not.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Mundt June 3, 2009, 3:36 a.m. UTC | #6
On Tue, Jun 02, 2009 at 04:49:26AM -0700, Daniel Walker wrote:
> On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> >  unsigned long long __attribute__((weak)) sched_clock(void)
> >  {
> > -       return (unsigned long long)(jiffies - INITIAL_JIFFIES)
> > -                                       * (NSEC_PER_SEC / HZ);
> > +       unsigned long long time;
> > +       struct clocksource *clock;
> > +
> > +       rcu_read_lock();
> > +       clock = rcu_dereference(sched_clocksource);
> > +       time = cyc2ns(clock, clocksource_read(clock));
> > +       rcu_read_unlock();
> > +
> > +       return time;
> >  }
> 
> My concerns with the locking here still stand. Nothing you've said or
> done bolsters the clocksource in modules argument. I think what your
> planning for sh clocksources seems very inelegant. I would imagine a
> better solution is out there. I'd prefer if you just leave sched_clock
> alone.
> 
This is the first I've heard you mention locking concerns, and as usual
there is not enough technical content (or any, really) to go on to even
reply to this. Whether you consider my solution for sh clocksources
elegant or not is irrelevant, as I wasn't soliciting feedback, and it's a
problem that has to be dealt with regardless of whether it's a pretty one
or not.

If at such a time you wish to post something bordering on a real
technical concern, we can continue this thread of conversation, until
then I'll be sure to drop you from future versions of the patch. If you
want to hand-wave, do it somewhere else, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Walker June 3, 2009, 2:58 p.m. UTC | #7
On Wed, 2009-06-03 at 12:36 +0900, Paul Mundt wrote:
> On Tue, Jun 02, 2009 at 04:49:26AM -0700, Daniel Walker wrote:
> > On Tue, 2009-06-02 at 16:54 +0900, Paul Mundt wrote:
> > >  unsigned long long __attribute__((weak)) sched_clock(void)
> > >  {
> > > -       return (unsigned long long)(jiffies - INITIAL_JIFFIES)
> > > -                                       * (NSEC_PER_SEC / HZ);
> > > +       unsigned long long time;
> > > +       struct clocksource *clock;
> > > +
> > > +       rcu_read_lock();
> > > +       clock = rcu_dereference(sched_clocksource);
> > > +       time = cyc2ns(clock, clocksource_read(clock));
> > > +       rcu_read_unlock();
> > > +
> > > +       return time;
> > >  }
> > 
> > My concerns with the locking here still stand. Nothing you've said or
> > done bolsters the clocksource in modules argument. I think what your
> > planning for sh clocksources seems very inelegant. I would imagine a
> > better solution is out there. I'd prefer if you just leave sched_clock
> > alone.
> > 
> This is the first I've heard you mention locking concerns, and as usual
> there is not enough technical content (or any, really) to go on to even
> reply to this. Whether you consider my solution for sh clocksources
> elegant or not is irrelevant, as I wasn't soliciting feedback, and it's a
> problem that has to be dealt with regardless of whether it's a pretty one
> or not.

I think maybe your not reading my emails .. I said there is unnessesary
locking in sched_clock, and that it's fixing a problem that doesn't
exist (i.e. clocksources in modules) .. You claim you want clocksources
in modules because you have a useful case in sh, which I consider not
very useful .. And your refusing to remove the locking, or that's how it
seems. So I'd prefer you don't submit this code any longer. I don't
think you know what your doing in this case.

This is not hand waving, and it is technical.

Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index c56457c..2109940 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -202,7 +202,8 @@  struct clocksource {
 #endif
 };
 
-extern struct clocksource *clock;	/* current clocksource */
+extern struct clocksource *clock;		/* current clocksource */
+extern struct clocksource *sched_clocksource;	/* sched_clock() clocksource */
 
 /*
  * Clock source flags bits::
@@ -212,6 +213,7 @@  extern struct clocksource *clock;	/* current clocksource */
 
 #define CLOCK_SOURCE_WATCHDOG			0x10
 #define CLOCK_SOURCE_VALID_FOR_HRES		0x20
+#define CLOCK_SOURCE_USE_FOR_SCHED_CLOCK	0x40
 
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)
diff --git a/kernel/sched_clock.c b/kernel/sched_clock.c
index e1d16c9..b51d48d 100644
--- a/kernel/sched_clock.c
+++ b/kernel/sched_clock.c
@@ -30,6 +30,8 @@ 
 #include <linux/percpu.h>
 #include <linux/ktime.h>
 #include <linux/sched.h>
+#include <linux/clocksource.h>
+#include <linux/rcupdate.h>
 
 /*
  * Scheduler clock - returns current time in nanosec units.
@@ -38,8 +40,15 @@ 
  */
 unsigned long long __attribute__((weak)) sched_clock(void)
 {
-	return (unsigned long long)(jiffies - INITIAL_JIFFIES)
-					* (NSEC_PER_SEC / HZ);
+	unsigned long long time;
+	struct clocksource *clock;
+
+	rcu_read_lock();
+	clock = rcu_dereference(sched_clocksource);
+	time = cyc2ns(clock, clocksource_read(clock));
+	rcu_read_unlock();
+
+	return time;
 }
 
 static __read_mostly int sched_clock_running;
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 80189f6..f7243f2 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -25,6 +25,7 @@ 
  */
 
 #include <linux/clocksource.h>
+#include <linux/rcupdate.h>
 #include <linux/sysdev.h>
 #include <linux/init.h>
 #include <linux/module.h>
@@ -109,6 +110,7 @@  EXPORT_SYMBOL(timecounter_cyc2time);
 
 /* XXX - Would like a better way for initializing curr_clocksource */
 extern struct clocksource clocksource_jiffies;
+struct clocksource *sched_clocksource = &clocksource_jiffies;
 
 /*[Clocksource internal variables]---------
  * curr_clocksource:
@@ -362,6 +364,9 @@  static struct clocksource *select_clocksource(void)
 	if (next == curr_clocksource)
 		return NULL;
 
+	if (next->flags & CLOCK_SOURCE_USE_FOR_SCHED_CLOCK)
+		rcu_assign_pointer(sched_clocksource, next);
+
 	return next;
 }
 
@@ -440,7 +445,21 @@  void clocksource_unregister(struct clocksource *cs)
 	list_del(&cs->list);
 	if (clocksource_override == cs)
 		clocksource_override = NULL;
+
 	next_clocksource = select_clocksource();
+
+	/*
+	 * If select_clocksource() fails to find another suitable
+	 * clocksource for sched_clocksource and we are unregistering
+	 * it, switch back to jiffies.
+	 */
+	if (sched_clocksource == cs) {
+		rcu_assign_pointer(sched_clocksource, &clocksource_jiffies);
+		spin_unlock_irqrestore(&clocksource_lock, flags);
+		synchronize_rcu();
+		return;
+	}
+
 	spin_unlock_irqrestore(&clocksource_lock, flags);
 }
 
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index c3f6c30..727d881 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -52,7 +52,7 @@ 
 
 static cycle_t jiffies_read(struct clocksource *cs)
 {
-	return (cycle_t) jiffies;
+	return (cycle_t) (jiffies - INITIAL_JIFFIES);
 }
 
 struct clocksource clocksource_jiffies = {