diff mbox series

[RFC,01/13] arm: Fix mutual exclusion in arch_gettimeoffset

Message ID d1559db8eddb5b06747b152ae69b63c3aa66ed09.1541995959.git.fthain@telegraphics.com.au (mailing list archive)
State RFC
Headers show
Series [RFC,01/13] arm: Fix mutual exclusion in arch_gettimeoffset | expand

Commit Message

Finn Thain Nov. 12, 2018, 4:12 a.m. UTC
Implementations of arch_gettimeoffset are generally not re-entrant
and assume that interrupts have been disabled. Unfortunately this
pre-condition got broken in v2.6.32.

Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 arch/arm/mach-ebsa110/core.c | 5 +++++
 arch/arm/mach-rpc/time.c     | 5 +++++
 2 files changed, 10 insertions(+)

Comments

Christoph Hellwig Nov. 12, 2018, 8:34 a.m. UTC | #1
On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> Implementations of arch_gettimeoffset are generally not re-entrant
> and assume that interrupts have been disabled. Unfortunately this
> pre-condition got broken in v2.6.32.
> 
> Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> Signed-off-by: Finn Thain <fthain@telegraphics.com.au>

This looks like a sensible fix for -stable backporting.  But with m68k
converted over it might be time for these two arm platforms to either
be converted to the clocksource API or to be dropped..
Finn Thain Nov. 13, 2018, 3:39 a.m. UTC | #2
On Mon, 12 Nov 2018, Christoph Hellwig wrote:

> On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> > Implementations of arch_gettimeoffset are generally not re-entrant and 
> > assume that interrupts have been disabled. Unfortunately this 
> > pre-condition got broken in v2.6.32.
> > 
> > Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> > Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
> 
> This looks like a sensible fix for -stable backporting.  But with m68k
> converted over it might be time for these two arm platforms to either
> be converted to the clocksource API or to be dropped..
> 

You could remove the old arch_gettimeoffset API without dropping any 
platforms.

If no-one converts a given platform to the clocksource API it would mean 
that the default 'jiffies' clocksource will get used on that platform.

Clock resolution and timer precision would be degraded, but that might not 
matter.

Anyway, if someone who has this hardware is willing to test a clocksource 
API conversion, they can let me know and I'll attempt that patch.
Russell King (Oracle) Nov. 13, 2018, 9:20 a.m. UTC | #3
On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> On Mon, 12 Nov 2018, Christoph Hellwig wrote:
> 
> > On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> > > Implementations of arch_gettimeoffset are generally not re-entrant and 
> > > assume that interrupts have been disabled. Unfortunately this 
> > > pre-condition got broken in v2.6.32.
> > > 
> > > Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> > > Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
> > 
> > This looks like a sensible fix for -stable backporting.  But with m68k
> > converted over it might be time for these two arm platforms to either
> > be converted to the clocksource API or to be dropped..
> > 
> 
> You could remove the old arch_gettimeoffset API without dropping any 
> platforms.
> 
> If no-one converts a given platform to the clocksource API it would mean 
> that the default 'jiffies' clocksource will get used on that platform.
> 
> Clock resolution and timer precision would be degraded, but that might not 
> matter.
> 
> Anyway, if someone who has this hardware is willing to test a clocksource 
> API conversion, they can let me know and I'll attempt that patch.

There's reasons why that's not appropriate - such as not having two
separate timers in order to supply a clocksource and separate clock
event.

Not all hardware is suited to the clocksource + clockevent idea.
Finn Thain Nov. 13, 2018, 9:55 p.m. UTC | #4
On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:

> On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> > 
> > You could remove the old arch_gettimeoffset API without dropping any 
> > platforms.
> > 
> > If no-one converts a given platform to the clocksource API it would mean 
> > that the default 'jiffies' clocksource will get used on that platform.
> > 
> > Clock resolution and timer precision would be degraded, but that might not 
> > matter.
> > 
> > Anyway, if someone who has this hardware is willing to test a clocksource 
> > API conversion, they can let me know and I'll attempt that patch.
> 
> There's reasons why that's not appropriate - such as not having two
> separate timers in order to supply a clocksource and separate clock
> event.
> 
> Not all hardware is suited to the clocksource + clockevent idea.
> 

Sorry, I don't follow.

AFAIK, clocksources and clock event devices are orthogonal concepts. There 
are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and 
every other combination).

A clocksource read method just provides a cycle count, and in this sense 
arch_gettimeoffset() is equivalent to a clocksource.

If these two arm platforms have an existing clock event device which 
somehow precludes any new clocksources, why doesn't that also render 
arch_gettimeoffset() impossible?
Russell King (Oracle) Nov. 13, 2018, 11:43 p.m. UTC | #5
On Wed, Nov 14, 2018 at 08:55:37AM +1100, Finn Thain wrote:
> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
> 
> > On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> > > 
> > > You could remove the old arch_gettimeoffset API without dropping any 
> > > platforms.
> > > 
> > > If no-one converts a given platform to the clocksource API it would mean 
> > > that the default 'jiffies' clocksource will get used on that platform.
> > > 
> > > Clock resolution and timer precision would be degraded, but that might not 
> > > matter.
> > > 
> > > Anyway, if someone who has this hardware is willing to test a clocksource 
> > > API conversion, they can let me know and I'll attempt that patch.
> > 
> > There's reasons why that's not appropriate - such as not having two
> > separate timers in order to supply a clocksource and separate clock
> > event.
> > 
> > Not all hardware is suited to the clocksource + clockevent idea.
> > 
> 
> Sorry, I don't follow.
> 
> AFAIK, clocksources and clock event devices are orthogonal concepts. There 
> are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and 
> every other combination).
> 
> A clocksource read method just provides a cycle count, and in this sense 
> arch_gettimeoffset() is equivalent to a clocksource.

A clocksource provides a cycle counter that monotonically changes and
does not wrap between clockevent events.

A clock event is responsible for providing events to the system when
some work is needing to be done, limited by the wrap interval of the
clocksource.

Each time the clock event triggers an interrupt, the clocksource is
read to determine how much time has passed, using:

	count = (new_value - old_value) & available_bits
	nanosecs = count * scale >> shift;

If you try to combine the clocksource and clockevent because you only
have a single counter, and the counter has the behaviour of:
- counting down towards zero
- when reaching zero, triggers an interrupt, and reloads with N

then this provides your clockevent, but you can't use this as a clock
source, because each time you receive an interrupt and try to read the
counter value, it will be approximately the same value.  This means
that the above calculation fails to register the correct number of
nanoseconds passing.  Hence, this does not work.

Also note where I said above that the clock event device must be able
to provide an interrupt _before_ the clocksource wraps - clearly with
such a timer, that is utterly impossible.

The simple solution is to not use such a counter as the clocksource,
which means you fall back to using the jiffies clocksource, and your
timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
want a 1kHz timer interval.  For most applications, that's simply way
to coarse, as was realised in the very early days of Linux.

If only there was a way to interpolate between timer interrupts...
which is exactly what arch_gettimeoffset() does, and is a historical
reminant of the pre-clocksource/clockevent days of the kernel - but
it is the only way to get better resolution from this sort of setup.
Michael Schmitz Nov. 14, 2018, 1:35 a.m. UTC | #6
On 14/11/18 12:43 PM, Russell King - ARM Linux wrote:
> On Wed, Nov 14, 2018 at 08:55:37AM +1100, Finn Thain wrote:
>> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
>>
>>> On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
>>>> You could remove the old arch_gettimeoffset API without dropping any
>>>> platforms.
>>>>
>>>> If no-one converts a given platform to the clocksource API it would mean
>>>> that the default 'jiffies' clocksource will get used on that platform.
>>>>
>>>> Clock resolution and timer precision would be degraded, but that might not
>>>> matter.
>>>>
>>>> Anyway, if someone who has this hardware is willing to test a clocksource
>>>> API conversion, they can let me know and I'll attempt that patch.
>>> There's reasons why that's not appropriate - such as not having two
>>> separate timers in order to supply a clocksource and separate clock
>>> event.
>>>
>>> Not all hardware is suited to the clocksource + clockevent idea.
>>>
>> Sorry, I don't follow.
>>
>> AFAIK, clocksources and clock event devices are orthogonal concepts. There
>> are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and
>> every other combination).
>>
>> A clocksource read method just provides a cycle count, and in this sense
>> arch_gettimeoffset() is equivalent to a clocksource.
> A clocksource provides a cycle counter that monotonically changes and
> does not wrap between clockevent events.
>
> A clock event is responsible for providing events to the system when
> some work is needing to be done, limited by the wrap interval of the
> clocksource.
>
> Each time the clock event triggers an interrupt, the clocksource is
> read to determine how much time has passed, using:
>
> 	count = (new_value - old_value) & available_bits
> 	nanosecs = count * scale >> shift;
>
> If you try to combine the clocksource and clockevent because you only
> have a single counter, and the counter has the behaviour of:
> - counting down towards zero
> - when reaching zero, triggers an interrupt, and reloads with N
>
> then this provides your clockevent, but you can't use this as a clock
> source, because each time you receive an interrupt and try to read the
> counter value, it will be approximately the same value.  This means
> that the above calculation fails to register the correct number of
> nanoseconds passing.  Hence, this does not work.

So we'd still have to use jiffies + interpolation from the current timer 
rundown counter as clocksource (since that will be monotonous and won't 
wrap)?

The way Finn did the clocksource for m68k, the clocksource counter does 
behave as it should (monotonous, doesn't wrap) at least so far as I've 
tested. Using the same timer for clocksource and clock events will 
degrade timekeeping based on timer events to jiffies resolution, as you 
point out. That would already have been the case before, so no gain in 
resolution. Other timekeeping would have worked at higher resolution 
before (interpolating through arch_gettimeoffset) just the same as now 
(interpolating through clocksource reads). Finn's clocksource read code 
essentially is arch_gettimeoffset under a new name.

Are you saying that's not possible on arm, because the current timer 
rundown counter can't be read while the timer is running?

If I were to run a second timer at higher rate for clocksource, but 
keeping the 10 ms timer as clock event (could easily do that by using 
timer D on Atari Falcon) - how would that improve my timekeeping? Clock 
events still only happen 10 ms apart ...

Cheers,

     Michael

>
> Also note where I said above that the clock event device must be able
> to provide an interrupt _before_ the clocksource wraps - clearly with
> such a timer, that is utterly impossible.
>
> The simple solution is to not use such a counter as the clocksource,
> which means you fall back to using the jiffies clocksource, and your
> timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> want a 1kHz timer interval.  For most applications, that's simply way
> to coarse, as was realised in the very early days of Linux.
>
> If only there was a way to interpolate between timer interrupts...
> which is exactly what arch_gettimeoffset() does, and is a historical
> reminant of the pre-clocksource/clockevent days of the kernel - but
> it is the only way to get better resolution from this sort of setup.
>
Finn Thain Nov. 14, 2018, 3:17 a.m. UTC | #7
On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:

> 
> A clocksource provides a cycle counter that monotonically changes and 
> does not wrap between clockevent events.
> 
> A clock event is responsible for providing events to the system when 
> some work is needing to be done, limited by the wrap interval of the 
> clocksource.
> 
> Each time the clock event triggers an interrupt, the clocksource is
> read to determine how much time has passed, using:
> 
> 	count = (new_value - old_value) & available_bits
> 	nanosecs = count * scale >> shift;
> 
> If you try to combine the clocksource and clockevent because you only
> have a single counter, and the counter has the behaviour of:
> - counting down towards zero
> - when reaching zero, triggers an interrupt, and reloads with N
> 
> then this provides your clockevent, but you can't use this as a clock
> source, because each time you receive an interrupt and try to read the
> counter value, it will be approximately the same value.  This means
> that the above calculation fails to register the correct number of
> nanoseconds passing.  Hence, this does not work.
> 
> Also note where I said above that the clock event device must be able
> to provide an interrupt _before_ the clocksource wraps - clearly with
> such a timer, that is utterly impossible.
> 
> The simple solution is to not use such a counter as the clocksource,
> which means you fall back to using the jiffies clocksource, and your
> timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> want a 1kHz timer interval.  For most applications, that's simply way
> to coarse, as was realised in the very early days of Linux.
> 
> If only there was a way to interpolate between timer interrupts...
> which is exactly what arch_gettimeoffset() does, and is a historical
> reminant of the pre-clocksource/clockevent days of the kernel - but
> it is the only way to get better resolution from this sort of setup.
> 

Both of the platforms in question (RPC and EBSA110) have not 
defined(CONFIG_GENERIC_CLOCKEVENTS) and have not defined any struct 
clock_event_device, AFAICT.

So, even assuming that you're right about the limitations of single-timer 
platforms in general, removal of arch_gettimeoffset wouldn't require the 
removal of any platforms, AFAICT.
Russell King (Oracle) Nov. 14, 2018, 7:58 a.m. UTC | #8
On Wed, Nov 14, 2018 at 02:35:29PM +1300, Michael Schmitz wrote:
> So we'd still have to use jiffies + interpolation from the current timer
> rundown counter as clocksource (since that will be monotonous and won't
> wrap)?
> 
> The way Finn did the clocksource for m68k, the clocksource counter does
> behave as it should (monotonous, doesn't wrap) at least so far as I've
> tested. Using the same timer for clocksource and clock events will degrade
> timekeeping based on timer events to jiffies resolution, as you point out.
> That would already have been the case before, so no gain in resolution.

... and that is where you are wrong.

RPC, for example, has gettimeofday() resolution of 500ns.  Removing
gettimeoffset() will result in a resolution of 1/HZ since there is
no longer any interpolation between interrupts.

> Other timekeeping would have worked at higher resolution before
> (interpolating through arch_gettimeoffset) just the same as now
> (interpolating through clocksource reads). Finn's clocksource read code
> essentially is arch_gettimeoffset under a new name.

Where is this code - all I've seen is code adding IRQ exclusion in
the gettimeoffset method.  If some other solution is being proposed,
then it's no good beating about the bush - show the damn code, and
then that can be considered.

However, what has been said in this thread is basically "remove the
gettimeoffset method", which is _not_ acceptable, it will cause
gettimeofday() on these platforms to lose resolution, which is a
user visible REGRESSION, plain and simple.

If what is actually meant is "we have a replacement for gettimeoffset"
then, and I'm sure we all know this, there needs to be a patch
proposed showing what is being proposed, rather than waffling around
the issue.

> Are you saying that's not possible on arm, because the current timer rundown
> counter can't be read while the timer is running?
> 
> If I were to run a second timer at higher rate for clocksource, but keeping
> the 10 ms timer as clock event (could easily do that by using timer D on
> Atari Falcon) - how would that improve my timekeeping? Clock events still
> only happen 10 ms apart ...

Ah, I think you're talking about something else.

You seem to be talking about what happens when time keeping interrupts
happen.

I'm talking about the resolution of gettimeofday() and the other
interfaces that are used (eg) for packet capture timestamping and
the like - the _user_ visible effects of the timekeeping system.

With the existing implementation, all these have better-than-jiffy
resolution - in the case of RPC, that's 500ns, rather than 10ms
which would be the case without gettimeoffset().  Removing
gettimeoffset() as Finn has proposed without preserving that
resolution is simply completely unacceptable.
Russell King (Oracle) Nov. 14, 2018, 12:05 p.m. UTC | #9
On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> Implementations of arch_gettimeoffset are generally not re-entrant
> and assume that interrupts have been disabled. Unfortunately this
> pre-condition got broken in v2.6.32.

To me, it looks way worse than what you think.

The original code sequence before conversion to arch_gettimeoffset()
was:

1. lock (inc. disabling IRQs)
2. read gettimeoffset correction
3. add offset to current xtime
4. unlock

This means there is no chance for a timer interrupt to happen between
reading the current offset and adding it to the current kernel time.
If the timer does roll over, then the interrupt will remain pending,
so the whole "read offset and add to xtime" is one atomic block and
we can use the pending interrupt to indicate whether the timer has
rolled over and apply the appropriate offset correction.

With the arch_gettimeoffset() code, that changes:

1. read clocksource (which will be jiffies)
2. compute clocksource delta
3. increment nanoseconds
4. add gettimeoffset correction

If we receive a timer interrupt part-way through this, for example,
between 2 and 3, then jiffies will increment (via do_timer()) and the
interrupt will be cleared.  This means that the check in
ioc_timer_gettimeoffset() for a pending interrupt, for example, will
be false, and we will not know that an interrupt has happened.

So, unless I'm missing something, commit 5cfc8ee0bb51 well and truely
broke the accuracy of timekeeping on these platforms, and will result
in timekeeping spontaneously jumping backwards.  I had been wondering
why NTP had become less stable on EBSA110, but never investigated.

However, that still does not justify blowing away this functionality
without replacement, as Finn has been proposing.
Russell King (Oracle) Nov. 14, 2018, 2:16 p.m. UTC | #10
On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
> 
> > 
> > A clocksource provides a cycle counter that monotonically changes and 
> > does not wrap between clockevent events.
> > 
> > A clock event is responsible for providing events to the system when 
> > some work is needing to be done, limited by the wrap interval of the 
> > clocksource.
> > 
> > Each time the clock event triggers an interrupt, the clocksource is
> > read to determine how much time has passed, using:
> > 
> > 	count = (new_value - old_value) & available_bits
> > 	nanosecs = count * scale >> shift;
> > 
> > If you try to combine the clocksource and clockevent because you only
> > have a single counter, and the counter has the behaviour of:
> > - counting down towards zero
> > - when reaching zero, triggers an interrupt, and reloads with N
> > 
> > then this provides your clockevent, but you can't use this as a clock
> > source, because each time you receive an interrupt and try to read the
> > counter value, it will be approximately the same value.  This means
> > that the above calculation fails to register the correct number of
> > nanoseconds passing.  Hence, this does not work.
> > 
> > Also note where I said above that the clock event device must be able
> > to provide an interrupt _before_ the clocksource wraps - clearly with
> > such a timer, that is utterly impossible.
> > 
> > The simple solution is to not use such a counter as the clocksource,
> > which means you fall back to using the jiffies clocksource, and your
> > timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> > want a 1kHz timer interval.  For most applications, that's simply way
> > to coarse, as was realised in the very early days of Linux.
> > 
> > If only there was a way to interpolate between timer interrupts...
> > which is exactly what arch_gettimeoffset() does, and is a historical
> > reminant of the pre-clocksource/clockevent days of the kernel - but
> > it is the only way to get better resolution from this sort of setup.
> > 
> 
> Both of the platforms in question (RPC and EBSA110) have not 
> defined(CONFIG_GENERIC_CLOCKEVENTS) and have not defined any struct 
> clock_event_device, AFAICT.

Correct, because they don't use clockevents.  This is pre-clocksource,
pre-clockevent code, it uses the old timekeeping infrastructure,
which is xtime_update() and providing a gettimeoffset implementation.

> So, even assuming that you're right about the limitations of single-timer 
> platforms in general, removal of arch_gettimeoffset wouldn't require the 
> removal of any platforms, AFAICT.

I haven't proposed removing platforms.

I'm just objecting to the idea of removing arch_gettimeoffset(),
thereby causing a regression by changing the resolution of
gettimeofday() without any sign of equivalent functionality.

However, I now see (having searched mailing lists) what you are
trying to do - you have _not_ copied me or the mailing lists I'm
on with your cover message, so I'm *totally* lacking in the context
of your patch series, particularly where you are converting m68k
to use clocksources without needing the gettimeoffset() stuff.

You have failed to explain that in this thread - probably assuming
that I've read your cover message.  I haven't until now, because
you never sent it to me or the linux-arm-kernel mailing list.

I have found this thread _very_ frustrating, and frankly a waste of
my time discussing the finer points because of this lack of context.
Please ensure that if you're going to be sending a patch series,
that the cover message at least finds its way to the intended
audience of your patches, so that everyone has the context they
need when looking at (eg) the single patch they may receive.

Alternatively, if someone raises a problem with the patch, and you
_know_ you haven't done that, then please consider informing them
where they can get more context, eg, by providing a link to your
archived cover message.  It would help avoid misunderstandings.

Thanks.
Geert Uytterhoeven Nov. 14, 2018, 2:58 p.m. UTC | #11
Hi Russell,

On Wed, Nov 14, 2018 at 3:16 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> > So, even assuming that you're right about the limitations of single-timer
> > platforms in general, removal of arch_gettimeoffset wouldn't require the
> > removal of any platforms, AFAICT.
>
> I haven't proposed removing platforms.
>
> I'm just objecting to the idea of removing arch_gettimeoffset(),
> thereby causing a regression by changing the resolution of
> gettimeofday() without any sign of equivalent functionality.
>
> However, I now see (having searched mailing lists) what you are
> trying to do - you have _not_ copied me or the mailing lists I'm
> on with your cover message, so I'm *totally* lacking in the context
> of your patch series, particularly where you are converting m68k
> to use clocksources without needing the gettimeoffset() stuff.
>
> You have failed to explain that in this thread - probably assuming
> that I've read your cover message.  I haven't until now, because
> you never sent it to me or the linux-arm-kernel mailing list.
>
> I have found this thread _very_ frustrating, and frankly a waste of
> my time discussing the finer points because of this lack of context.
> Please ensure that if you're going to be sending a patch series,
> that the cover message at least finds its way to the intended
> audience of your patches, so that everyone has the context they
> need when looking at (eg) the single patch they may receive.
>
> Alternatively, if someone raises a problem with the patch, and you
> _know_ you haven't done that, then please consider informing them
> where they can get more context, eg, by providing a link to your
> archived cover message.  It would help avoid misunderstandings.

Sorry for the lack of context.
The real trigger was also not explained in the cover message, and was a
the threat to remove platforms not using modern timekeeping APIs, cfr.
 "Removing support for old hardware from the kernel"
(https://lwn.net/Articles/769468/).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Russell King (Oracle) Nov. 14, 2018, 6:11 p.m. UTC | #12
On Wed, Nov 14, 2018 at 03:58:36PM +0100, Geert Uytterhoeven wrote:
> Hi Russell,
> 
> On Wed, Nov 14, 2018 at 3:16 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> > > So, even assuming that you're right about the limitations of single-timer
> > > platforms in general, removal of arch_gettimeoffset wouldn't require the
> > > removal of any platforms, AFAICT.
> >
> > I haven't proposed removing platforms.
> >
> > I'm just objecting to the idea of removing arch_gettimeoffset(),
> > thereby causing a regression by changing the resolution of
> > gettimeofday() without any sign of equivalent functionality.
> >
> > However, I now see (having searched mailing lists) what you are
> > trying to do - you have _not_ copied me or the mailing lists I'm
> > on with your cover message, so I'm *totally* lacking in the context
> > of your patch series, particularly where you are converting m68k
> > to use clocksources without needing the gettimeoffset() stuff.
> >
> > You have failed to explain that in this thread - probably assuming
> > that I've read your cover message.  I haven't until now, because
> > you never sent it to me or the linux-arm-kernel mailing list.
> >
> > I have found this thread _very_ frustrating, and frankly a waste of
> > my time discussing the finer points because of this lack of context.
> > Please ensure that if you're going to be sending a patch series,
> > that the cover message at least finds its way to the intended
> > audience of your patches, so that everyone has the context they
> > need when looking at (eg) the single patch they may receive.
> >
> > Alternatively, if someone raises a problem with the patch, and you
> > _know_ you haven't done that, then please consider informing them
> > where they can get more context, eg, by providing a link to your
> > archived cover message.  It would help avoid misunderstandings.
> 
> Sorry for the lack of context.
> The real trigger was also not explained in the cover message, and was a
> the threat to remove platforms not using modern timekeeping APIs, cfr.
>  "Removing support for old hardware from the kernel"
> (https://lwn.net/Articles/769468/).

And naturally, because kernel developers are oh so great at
communication, that decision has been communicated to those
affected by it. *NOT*.

Clearly there is a need for much better communication.  We're
better at it when dealing with patches, but not when it comes to
physical meetings.

Maybe when some decision like "we're going to kill stuff still
using the old gettimeoffset API" then _someone_ needs to identify
which platforms are using that and make sure that those maintainers
know about that decision, much the same way that if a patch were
to touch the gettimeoffset API, we'd make damn sure that such a
patch went to those affected by the change.
Michael Schmitz Nov. 15, 2018, 1:34 a.m. UTC | #13
On 14/11/18 8:58 PM, Russell King - ARM Linux wrote:
>
>> Are you saying that's not possible on arm, because the current timer rundown
>> counter can't be read while the timer is running?
>>
>> If I were to run a second timer at higher rate for clocksource, but keeping
>> the 10 ms timer as clock event (could easily do that by using timer D on
>> Atari Falcon) - how would that improve my timekeeping? Clock events still
>> only happen 10 ms apart ...
> Ah, I think you're talking about something else.
>
> You seem to be talking about what happens when time keeping interrupts
> happen.
That's what I understood your comment was about.
> I'm talking about the resolution of gettimeofday() and the other
> interfaces that are used (eg) for packet capture timestamping and
> the like - the _user_ visible effects of the timekeeping system.
>
> With the existing implementation, all these have better-than-jiffy
> resolution - in the case of RPC, that's 500ns, rather than 10ms
> which would be the case without gettimeoffset().  Removing
> gettimeoffset() as Finn has proposed without preserving that
> resolution is simply completely unacceptable.

I agree - but Finn had also offered to supply patches to arm that would 
have added a clocksource_read() function very much like for those m68k 
platforms that had used gettimeoffset(). I wondered what reason there 
was for these not to work on arm

I now realize you'd never seen that offer.

The proposal to drop architectures still relying on arch_gettimeoffset() 
had been raising enough of a response on linux-m68k to make me conclude 
that same proposal had been kicked on to other arch MLs affected as 
well. I'm a bit naive that way.

Now your criticism of arch_gettimeoffset() (missing timer rollover 
because the timer interrupt has been cleared by the interrupt handler) 
still stands. I just can't find the offset() functions shown in the 
5cfc8ee0bb51 patch. Any hints?

Cheers,

     Michael
Finn Thain Nov. 15, 2018, 4:12 a.m. UTC | #14
On Wed, 14 Nov 2018, Russell King - ARM Linux wrote:

> However, I now see (having searched mailing lists) what you are trying 
> to do - you have _not_ copied me or the mailing lists I'm on with your 
> cover message, so I'm *totally* lacking in the context of your patch 
> series, particularly where you are converting m68k to use clocksources 
> without needing the gettimeoffset() stuff.
> 

True. I should have included all interested parties in the recipients of 
the cover letter. My bad.

> You have failed to explain that in this thread - probably assuming that 
> I've read your cover message.

I offered to write a patch to add a clocksource to replace the 
arch_gettimeoffset functionality for RPC and EBSA110.

You even replied to that offer.

I did not propose degrading functionality while there is someone able to 
test modernization patches (assuming there is...).

Would you consider merging untested modernization patches for the 
arch_gettimeoffset API?

> I haven't until now, because you never sent it to me or the 
> linux-arm-kernel mailing list.
> 
> I have found this thread _very_ frustrating, and frankly a waste of my 
> time discussing the finer points because of this lack of context.

Sorry for any frustration I've caused you.

The thread went way off-topic when Christoph took the opportunity to 
suggest the removal of RPC and EBSA110. That doesn't interest me.

My interest remains API modernization. The actual patches I've sent are 
intended to modernize the clock API *without* degrading any functionality.

> Please ensure that if you're going to be sending a patch series, that 
> the cover message at least finds its way to the intended audience of 
> your patches, so that everyone has the context they need when looking at 
> (eg) the single patch they may receive.
> 

OK. I'll have to improve my patch submission scripts.
Russell King (Oracle) Nov. 16, 2018, 5:47 p.m. UTC | #15
On Thu, Nov 15, 2018 at 03:12:17PM +1100, Finn Thain wrote:
> The thread went way off-topic when Christoph took the opportunity to 
> suggest the removal of RPC and EBSA110. That doesn't interest me.

I suspect that is the right solution - I've been trying to get 4.19
to boot on my RPC and it's proving very difficult for several reasons,
not limited to the HDD seeming to be throwing the odd disk error, as
well as the kernel being rather large (a 4.19 kernel is 2.7M compressed
as opposed to 2.6.29 being 1.2M compressed for the equivalent
configuration.)

Given that memory on the RPC is not contiguous, that probably means
an uncompressed kernel overflows the size of the first memory bank,
so there's probably no hope for modern kernels to boot on the machine.

The EBSA110 is probably in a similar boat - I don't remember whether it
had 16MB or 32MB as the maximal amount of memory, but memory was getting
tight with some kernels even running a minimalist userspace.

So, it's probably time to say goodbyte to the kernel support for these
platforms.
Finn Thain Nov. 16, 2018, 10:49 p.m. UTC | #16
On Fri, 16 Nov 2018, Russell King - ARM Linux wrote:

> 
> The EBSA110 is probably in a similar boat - I don't remember whether it 
> had 16MB or 32MB as the maximal amount of memory, but memory was getting 
> tight with some kernels even running a minimalist userspace.
> 
> So, it's probably time to say goodbyte to the kernel support for these 
> platforms.
> 

Your call.

Note that removing code from mainline won't help users obtain older, 
smaller, -stable kernel releases, free from the bug we were discussing. 
(The bug appeared in Linux v2.6.32.)

BTW, if you did want to boot Linux on a 16 MB system, you do have some 
options.

https://lwn.net/Articles/741494/
https://lwn.net/Articles/608945/
https://tiny.wiki.kernel.org/

Contributing to this kind of effort probably has value for IoT 
deployments. I suspect it also cuts a small amount of bloat from a large 
number of other Linux systems.
Michael Schmitz Nov. 17, 2018, 3 a.m. UTC | #17
Hi Finn,

Am 17.11.2018 um 11:49 schrieb Finn Thain:
> On Fri, 16 Nov 2018, Russell King - ARM Linux wrote:
>
>>
>> The EBSA110 is probably in a similar boat - I don't remember whether it
>> had 16MB or 32MB as the maximal amount of memory, but memory was getting
>> tight with some kernels even running a minimalist userspace.
>>
>> So, it's probably time to say goodbyte to the kernel support for these
>> platforms.
>>
>
> Your call.
>
> Note that removing code from mainline won't help users obtain older,
> smaller, -stable kernel releases, free from the bug we were discussing.
> (The bug appeared in Linux v2.6.32.)
>
> BTW, if you did want to boot Linux on a 16 MB system, you do have some
> options.
>
> https://lwn.net/Articles/741494/
> https://lwn.net/Articles/608945/
> https://tiny.wiki.kernel.org/
>
> Contributing to this kind of effort probably has value for IoT
> deployments. I suspect it also cuts a small amount of bloat from a large
> number of other Linux systems.

I boot 4.19 on a system with 14 MB RAM - 10 MB remain for user space 
once the kernel loads. After remote login, 4 MB of that remain free for 
buffers or user code (1.5 MB swapped).
That's with sysvinit - Christian tried to boot a systemd userland on his 
Falcon once, and ran out of memory before swap could be activated.

I shouldn't say 16 or 32 MB are hopeless. And the 2.6 kernels were a lot 
more sluggish to my recollection.

Cheers,

	Michael
diff mbox series

Patch

diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index 688e5fed49a7..479f89a1accf 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -160,12 +160,17 @@  static void __init ebsa110_init_early(void)
  */
 static u32 ebsa110_gettimeoffset(void)
 {
+	unsigned long flags;
 	unsigned long offset, count;
 
+	local_irq_save(flags);
+
 	__raw_writeb(0x40, PIT_CTRL);
 	count = __raw_readb(PIT_T1);
 	count |= __raw_readb(PIT_T1) << 8;
 
+	local_irq_restore(flags);
+
 	/*
 	 * If count > COUNT, make the number negative.
 	 */
diff --git a/arch/arm/mach-rpc/time.c b/arch/arm/mach-rpc/time.c
index 2689771c1d38..852bb3801638 100644
--- a/arch/arm/mach-rpc/time.c
+++ b/arch/arm/mach-rpc/time.c
@@ -29,9 +29,12 @@ 
 
 static u32 ioc_timer_gettimeoffset(void)
 {
+	unsigned long flags;
 	unsigned int count1, count2, status;
 	long offset;
 
+	local_irq_save(flags);
+
 	ioc_writeb (0, IOC_T0LATCH);
 	barrier ();
 	count1 = ioc_readb(IOC_T0CNTL) | (ioc_readb(IOC_T0CNTH) << 8);
@@ -42,6 +45,8 @@  static u32 ioc_timer_gettimeoffset(void)
 	barrier ();
 	count2 = ioc_readb(IOC_T0CNTL) | (ioc_readb(IOC_T0CNTH) << 8);
 
+	local_irq_restore(flags);
+
 	offset = count2;
 	if (count2 < count1) {
 		/*