
[RFC,v1,00/10] archs/random: fallback to using sched_clock() if no cycle counter

Message ID 20220408182145.142506-1-Jason@zx2c4.com (mailing list archive)

Message

Jason A. Donenfeld April 8, 2022, 6:21 p.m. UTC
Hi folks,

The RNG uses a function called random_get_entropy() basically anytime
that it needs to timestamp an event. For example, an interrupt comes in,
and we mix a random_get_entropy() into the entropy pool somehow.
Somebody mashes their keyboard or moves their mouse around? We mix a
random_get_entropy() into the entropy pool. It's one of the main
varieties of input.
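To make that flow concrete, here is a userspace toy model of "mix a timestamp into the pool". The real input pool is mixed with a cryptographic hash, not this xor-rotate-multiply step; every name below is made up for the sketch and only illustrates the data flow event -> random_get_entropy() -> pool:

```c
#include <stdint.h>

/* Rotate a 32-bit value left by r bits (0 < r < 32). */
static uint32_t rol32(uint32_t v, int r)
{
	return (v << r) | (v >> (32 - r));
}

/* Toy mixer: return the new pool state after absorbing one timestamp.
 * The constants are arbitrary odd multipliers, chosen only so that
 * different timestamps produce different pool states in this model. */
static uint32_t mix_timestamp(uint32_t pool, uint32_t stamp)
{
	return rol32(pool ^ stamp, 7) * 0x85ebca6bu;
}
```

The only property that matters for this thread is that the mixed-in value changes from event to event; a random_get_entropy() that always returns 0 contributes nothing here.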

Unfortunately, it's always 0 on a few platforms. The RNG has accumulated
various hacks to deal with this, but in general it's not great. Surely
we can do better than 0. In fact, *anything* that's not the same exact
value all the time would be better than 0. Even a counter that
increments once per hour would be better than 0! I think you get the
idea.

On most platforms, random_get_entropy() is aliased to get_cycles(),
which makes sense for platforms where get_cycles() is defined. RDTSC,
for example, has all the characteristics we care about for this
function: it's fast to acquire (i.e. acceptable in an irq handler),
pretty high precision, available, forms a 2-monotone distribution, etc.
But for platforms without that, what is the next best thing?

Sometimes the next best thing is architecture-defined. For example,
really old MIPS has the CP0 random register, which isn't a cycle
counter, but is at least something. However, some platforms don't even
have an architecture-defined fallback. In that case, what are we left
with?

By my first guess, we have ktime_get_boottime_ns(), jiffies, and
sched_clock(). It seems like sched_clock() has already done a lot of
work in being always available with some incrementing value, falling
back to jiffies as necessary. So this series goes with that as a
fallback, for when the architecture doesn't define random_get_entropy in
its own way and when there's no working cycle counter.

Another option would be falling back to different things on different
platforms. For example, Arnd mentioned to me that on m68k,
ktime_get_ns() might be better than sched_clock(), because sched_clock()
there doesn't use CONFIG_GENERIC_SCHED_CLOCK and is therefore only as
good as jiffies.
Maybe there are other considerations there as well.

This gets a bit involved with asm/ header plumbing, which is why this is
an RFC. There are a few ways of skinning that cat. The patchset also
tries to fill in the various cases where an arch only sometimes has a
cycle counter and sometimes doesn't. When possible, it tries to make the
decisions at compile time, but sometimes runtime decisions are
necessary.

Please let me know if you think this is sane. And if you have a
different candidate than sched_clock(), I'd be interested to learn about
that. In particular, I'd value input here from Thomas or somebody else
who has looked at timekeeping across less common platforms.

Finally, note that this series isn't about "jitter entropy" or other
ways of initializing the RNG. That's a different topic for a different
thread. Please don't let this discussion veer off into that. Here, I'm
just trying to find a good fallback counter/timer for platforms without
get_cycles(), a question with limited scope.

Thanks,
Jason

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: x86@kernel.org
Cc: linux-xtensa@linux-xtensa.org

Jason A. Donenfeld (10):
  random: use sched_clock() for random_get_entropy() if no get_cycles()
  m68k: use sched_clock() for random_get_entropy() instead of zero
  riscv: use sched_clock() for random_get_entropy() instead of zero
  mips: use sched_clock() for random_get_entropy() instead of zero
  arm: use sched_clock() for random_get_entropy() instead of zero
  x86: use sched_clock() for random_get_entropy() instead of zero
  arm64: use sched_clock() for random_get_entropy() instead of zero
  um: use sched_clock() for random_get_entropy() instead of zero
  sparc: use sched_clock() for random_get_entropy() instead of zero
  xtensa: use sched_clock() for random_get_entropy() instead of zero

 arch/arm/include/asm/timex.h      | 11 +++++++++++
 arch/arm64/include/asm/timex.h    |  9 +++++++++
 arch/m68k/include/asm/timex.h     |  4 +++-
 arch/mips/include/asm/timex.h     |  3 ++-
 arch/riscv/include/asm/timex.h    |  3 ++-
 arch/sparc/include/asm/timex_32.h |  4 +---
 arch/um/include/asm/timex.h       |  9 ++-------
 arch/x86/include/asm/tsc.h        | 11 +++++++++++
 arch/xtensa/include/asm/timex.h   |  6 ++----
 include/linux/timex.h             |  6 ++++++
 10 files changed, 49 insertions(+), 17 deletions(-)

Comments

Thomas Gleixner April 9, 2022, 11:29 p.m. UTC | #1
Jason,

On Fri, Apr 08 2022 at 20:21, Jason A. Donenfeld wrote:
> Sometimes the next best thing is architecture-defined. For example,
> really old MIPS has the CP0 random register, which isn't a cycle
> counter, but is at least something. However, some platforms don't even
> have an architecture-defined fallback. In that case, what are we left
> with?
>
> By my first guess, we have ktime_get_boottime_ns(), jiffies, and
> sched_clock(). It seems like sched_clock() has already done a lot of
> work in being always available with some incrementing value, falling
> back to jiffies as necessary. So this series goes with that as a
> fallback, for when the architecture doesn't define random_get_entropy in
> its own way and when there's no working cycle counter.

sched_clock() is a halfway sane option, but yes, as Arnd pointed out:

> Another option would be falling back to different things on different
> platforms. For example, Arnd mentioned to me that on m68k,
> ktime_get_ns() might be better than sched_clock(), because sched_clock()
> there doesn't use CONFIG_GENERIC_SCHED_CLOCK and is therefore only as
> good as jiffies.

ktime_get_ns() or the slightly faster variant ktime_get_mono_fast_ns()
are usable. In the worst case they are jiffies driven too, but systems
with jiffies clocksource are pretty much museum pieces.

It's slightly slower than get_cycles() and a get_cycles() based
sched_clock(), but you get the most granular clock of the platform
automatically, which has its charm too :)

The bad news is that depending on the clocksource frequency the lower
bits might never change. Always true for clocksource jiffies.
sched_clock() has the same issue.

But the below uncompiled hack gives you access to the 'best' clocksource
of a machine, i.e. the one which the platform decided to be the one
which is giving the best resolution. The minimal bitwidth of that is
AFAICT 20 bits. In the jiffies case this will at least advance every
tick.

The price, e.g. on x86 would be that RDTSC would be invoked via an
indirect function call. Not the end of the world...

Thanks,

        tglx
---
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -646,6 +646,11 @@ static void halt_fast_timekeeper(const s
 	update_fast_timekeeper(&tkr_dummy, &tk_fast_raw);
 }
 
+u32 ktime_read_raw_clock(void)
+{
+	return tk_clock_read(&tk_core.timekeeper.tkr_mono);
+}
+
 static RAW_NOTIFIER_HEAD(pvclock_gtod_chain);
 
 static void update_pvclock_gtod(struct timekeeper *tk, bool was_set)
Jason A. Donenfeld April 10, 2022, 2:25 p.m. UTC | #2
Hi Thomas,

On Sun, Apr 10, 2022 at 01:29:32AM +0200, Thomas Gleixner wrote:
> But the below uncompiled hack gives you access to the 'best' clocksource
> of a machine, i.e. the one which the platform decided to be the one
> which is giving the best resolution. The minimal bitwidth of that is
> AFAICT 20 bits. In the jiffies case this will at least advance every
> tick.

Oh, huh, that's pretty cool. I can try to make a commit out of that. Are
you suggesting I use this as the fallback for all platforms that
currently return zero, or just for m68k per Arnd's suggestion, and then
use sched_clock() for the others? It sounds to me like you're saying
this would be best for all of them. If so, that'd be quite nice.


> The price, e.g. on x86 would be that RDTSC would be invoked via an
> indirect function call. Not the end of the world...

Well on x86, random_get_entropy() is overridden in the arch/ code to
call get_cycles(). So this would really just be for 486 and for other
architectures with no cycle counter that are currently returning zero.
However, this brings up a good point: if your proposed
ktime_read_raw_clock() function really is so nice, should it be used
everywhere unconditionally with no arch-specific overrides? On x86, is
it really guaranteed to be RDTSC, and not, say, some off-core HPET
situation? And is this acceptable to call from the hard irq handler?

Not yet having too much knowledge, I'm tentatively leaning toward the
safe side, of just using ktime_read_raw_clock() in the current places
that return zero all the time -- that is, for the purpose this patchset
has.

Jason
Thomas Gleixner April 10, 2022, 8:57 p.m. UTC | #3
Jason,

On Sun, Apr 10 2022 at 16:25, Jason A. Donenfeld wrote:
> On Sun, Apr 10, 2022 at 01:29:32AM +0200, Thomas Gleixner wrote:
>> But the below uncompiled hack gives you access to the 'best' clocksource
>> of a machine, i.e. the one which the platform decided to be the one
>> which is giving the best resolution. The minimal bitwidth of that is
>> AFAICT 20 bits. In the jiffies case this will at least advance every
>> tick.
>
> Oh, huh, that's pretty cool. I can try to make a commit out of that. Are
> you suggesting I use this as the fallback for all platforms that
> currently return zero, or just for m68k per Arnd's suggestion, and then
> use sched_clock() for the others? It sounds to me like you're saying
> this would be best for all of them. If so, that'd be quite nice.

It's the best in terms of timekeeping. Not the fastest :)

>> The price, e.g. on x86 would be that RDTSC would be invoked via an
>> indirect function call. Not the end of the world...
>
> Well on x86, random_get_entropy() is overridden in the arch/ code to
> call get_cycles(). So this would really just be for 486 and for other
> architectures with no cycle counter that are currently returning zero.
> However, this brings up a good point: if your proposed
> ktime_read_raw_clock() function really is so nice, should it be used
> everywhere unconditionally with no arch-specific overrides? On x86, is
> it really guaranteed to be RDTSC, and not, say, some off-core HPET
> situation? And is this acceptable to call from the hard irq handler?

No, that's the sad part. On systems where TSC is unstable (for whatever
reason) this might fall back to some off-core clock (HPET, PMTIMER).
The good news is that this mostly affects older systems. After 20+
years of complaining, the hardware people seem to have figured out that
a fast, accessible, and reliable clocksource is something useful. :)

> Not yet having too much knowledge, I'm tentatively leaning toward the
> safe side, of just using ktime_read_raw_clock() in the current places
> that return zero all the time -- that is, for the purpose this patchset
> has.

That's probably a good approach and it's init/runtime discoverable.

Thanks,

        tglx
Jason A. Donenfeld April 10, 2022, 9:38 p.m. UTC | #4
Hi Thomas,

On Sun, Apr 10, 2022 at 10:57 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > Not yet having too much knowledge, I'm tentatively leaning toward the
> > safe side, of just using ktime_read_raw_clock() in the current places
> > that return zero all the time -- that is, for the purpose this patchset
> > has.
>
> That's probably a good approach and it's init/runtime discoverable.

Alright then. I'll send out a v2 and we'll see how that looks.

Jason
Eric Biggers April 10, 2022, 11:03 p.m. UTC | #5
On Fri, Apr 08, 2022 at 08:21:35PM +0200, Jason A. Donenfeld wrote:
> By my first guess, we have ktime_get_boottime_ns(), jiffies, and
> sched_clock(). It seems like sched_clock() has already done a lot of
> work in being always available with some incrementing value, falling
> back to jiffies as necessary. So this series goes with that as a
> fallback, for when the architecture doesn't define random_get_entropy in
> its own way and when there's no working cycle counter.

Won't this interact badly with how try_to_generate_entropy() (a.k.a. the "Linus
Jitter Dance") detects the presence of an appropriate timer currently?

        stack.cycles = random_get_entropy();

        /* Slow counter - or none. Don't even bother */
        if (stack.cycles == random_get_entropy())
                return;

So if random_get_entropy() always returns 0, then try_to_generate_entropy()
won't run.  However, if random_get_entropy() is even just a low-precision timer,
then try_to_generate_entropy() will have a chance of running, since the timer
might change between the two calls to random_get_entropy().  And if
try_to_generate_entropy() does run, then it credits 1 bit of entropy for every
iteration, regardless of the timer's precision.

This is an existing problem, but this patchset will make it worse, as it changes
a lot of cases from "no timer" to "low precision timer".

Perhaps try_to_generate_entropy() should check the timer at least 3 times and
verify that it changed each time?

- Eric
Jason A. Donenfeld April 10, 2022, 11:29 p.m. UTC | #6
Hi Eric,

On 4/11/22, Eric Biggers <ebiggers@kernel.org> wrote:
> On Fri, Apr 08, 2022 at 08:21:35PM +0200, Jason A. Donenfeld wrote:
>> By my first guess, we have ktime_get_boottime_ns(), jiffies, and
>> sched_clock(). It seems like sched_clock() has already done a lot of
>> work in being always available with some incrementing value, falling
>> back to jiffies as necessary. So this series goes with that as a
>> fallback, for when the architecture doesn't define random_get_entropy in
>> its own way and when there's no working cycle counter.
>
> Won't this interact badly with how try_to_generate_entropy() (a.k.a. the
> "Linus Jitter Dance") detects the presence of an appropriate timer
> currently?
>
>         stack.cycles = random_get_entropy();
>
>         /* Slow counter - or none. Don't even bother */
>         if (stack.cycles == random_get_entropy())
>                 return;
>
> So if random_get_entropy() always returns 0, then try_to_generate_entropy()
> won't run.  However, if random_get_entropy() is even just a low-precision
> timer, then try_to_generate_entropy() will have a chance of running, since
> the timer might change between the two calls to random_get_entropy().  And
> if try_to_generate_entropy() does run, then it credits 1 bit of entropy
> for every iteration, regardless of the timer's precision.
>
> This is an existing problem, but this patchset will make it worse, as it
> changes a lot of cases from "no timer" to "low precision timer".
>
> Perhaps try_to_generate_entropy() should check the timer at least 3 times
> and verify that it changed each time?

What you've identified is actually already the case for platforms
where the cycle counter is just slow (and there are a few such
platforms; my Odroid C2 even exhibits this). As you note, the
cycle counter might already be too slow, yet we get [un]lucky and
reach this code right on the cusp of a change.

So the problem isn't new, per se, with this patchset. But indeed
perhaps we should consider adjusting the heuristics for that a bit in
a separate patch. Your check-three-times idea seems like a good
starting point; if you want to send a patch, we can poke at it.

Jason