diff mbox series

random: vary jitter iterations based on cycle counter speed

Message ID 20220422132027.1267060-1-Jason@zx2c4.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Headers show
Series random: vary jitter iterations based on cycle counter speed | expand

Commit Message

Jason A. Donenfeld April 22, 2022, 1:20 p.m. UTC
Currently, we do the jitter dance if two consecutive reads to the cycle
counter return different values. If they do, then we consider the cycle
counter to be fast enough that one trip through the scheduler will yield
one "bit" of credited entropy. If those two reads return the same value,
then we assume the cycle counter is too slow to show meaningful
differences.

This methodology is flawed for a variety of reasons, one of which Eric
posted a patch to fix in [1]. The issue that patch solves is that on a
system with a slow counter, you might be [un]lucky and read the counter
_just_ before it changes, so that the second cycle counter you read
differs from the first, even though there's usually quite a large period
of time in between the two. For example:

| real time | cycle counter |
| --------- | ------------- |
| 3         | 5             |
| 4         | 5             |
| 5         | 5             |
| 6         | 5             |
| 7         | 5             | <--- a
| 8         | 6             | <--- b
| 9         | 6             | <--- c

If we read the counter at (a) and compare it to (b), we might be fooled
into thinking that it's a fast counter, when in reality it is not. The
solution in [1] is to also compare counter (b) to counter (c), on the
theory that if the counter is _actually_ slow, and (a)!=(b), then
certainly (b)==(c).

This helps solve this particular issue, in one sense, but in another
sense, it mostly functions to disallow jitter entropy on these systems,
rather than simply taking more samples in that case.

Instead, this patch takes a different approach. Right now we assume that
a difference in one set of consecutive samples means one "bit" of
credited entropy per scheduler trip. We can extend this so that a
difference in two sets of consecutive samples means one "bit" of
credited entropy per /two/ scheduler trips, and three for three, and
four for four. In other words, we can increase the amount of jitter
"work" we require for each "bit", depending on how slow the cycle
counter is.

So this patch takes whole bunch of samples, sees how many of them are
different, and divides to find the amount of work required per "bit",
and also requires that at least some minimum of them are different in
order to attempt any jitter entropy.

Note that this approach is still far from perfect. It's not a real
statistical estimate on how much these samples vary; it's not a
real-time analysis of the relevant input data. That remains a project
for another time. However, it does the same (partly flawed) assumptions
as the code that's there now, so it's probably not worse than the status
quo, and it handles the issue Eric mentioned in [1]. But, again, it's
probably a far cry from whatever a really robust version of this would
be.

[1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/
    https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/

Cc: Eric Biggers <ebiggers@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
This is an argument very much centered around the somewhat low bar of
being "not worse than before". If you can think of ways that it doesn't
even manage to clear that, please do pipe up.


 drivers/char/random.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

Comments

Vladimir Murzin July 13, 2022, 2:31 p.m. UTC | #1
Hi All,

On 4/22/22 14:20, Jason A. Donenfeld wrote:
> Currently, we do the jitter dance if two consecutive reads to the cycle
> counter return different values. If they do, then we consider the cycle
> counter to be fast enough that one trip through the scheduler will yield
> one "bit" of credited entropy. If those two reads return the same value,
> then we assume the cycle counter is too slow to show meaningful
> differences.
> 
> This methodology is flawed for a variety of reasons, one of which Eric
> posted a patch to fix in [1]. The issue that patch solves is that on a
> system with a slow counter, you might be [un]lucky and read the counter
> _just_ before it changes, so that the second cycle counter you read
> differs from the first, even though there's usually quite a large period
> of time in between the two. For example:
> 
> | real time | cycle counter |
> | --------- | ------------- |
> | 3         | 5             |
> | 4         | 5             |
> | 5         | 5             |
> | 6         | 5             |
> | 7         | 5             | <--- a
> | 8         | 6             | <--- b
> | 9         | 6             | <--- c
> 
> If we read the counter at (a) and compare it to (b), we might be fooled
> into thinking that it's a fast counter, when in reality it is not. The
> solution in [1] is to also compare counter (b) to counter (c), on the
> theory that if the counter is _actually_ slow, and (a)!=(b), then
> certainly (b)==(c).
> 
> This helps solve this particular issue, in one sense, but in another
> sense, it mostly functions to disallow jitter entropy on these systems,
> rather than simply taking more samples in that case.
> 
> Instead, this patch takes a different approach. Right now we assume that
> a difference in one set of consecutive samples means one "bit" of
> credited entropy per scheduler trip. We can extend this so that a
> difference in two sets of consecutive samples means one "bit" of
> credited entropy per /two/ scheduler trips, and three for three, and
> four for four. In other words, we can increase the amount of jitter
> "work" we require for each "bit", depending on how slow the cycle
> counter is.
> 
> So this patch takes whole bunch of samples, sees how many of them are
> different, and divides to find the amount of work required per "bit",
> and also requires that at least some minimum of them are different in
> order to attempt any jitter entropy.
> 
> Note that this approach is still far from perfect. It's not a real
> statistical estimate on how much these samples vary; it's not a
> real-time analysis of the relevant input data. That remains a project
> for another time. However, it does the same (partly flawed) assumptions
> as the code that's there now, so it's probably not worse than the status
> quo, and it handles the issue Eric mentioned in [1]. But, again, it's
> probably a far cry from whatever a really robust version of this would
> be.
> 
> [1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/
>     https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/
> 
> Cc: Eric Biggers <ebiggers@google.com>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> This is an argument very much centered around the somewhat low bar of
> being "not worse than before". If you can think of ways that it doesn't
> even manage to clear that, please do pipe up.
> 
> 
>  drivers/char/random.c | 36 ++++++++++++++++++++++++++----------
>  1 file changed, 26 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index bf89c6f27a19..94a2ddb53662 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1354,6 +1354,12 @@ void add_interrupt_randomness(int irq)
>  }
>  EXPORT_SYMBOL_GPL(add_interrupt_randomness);
>  
> +struct entropy_timer_state {
> +	unsigned long entropy;
> +	struct timer_list timer;
> +	unsigned int samples, samples_per_bit;
> +};
> +
>  /*
>   * Each time the timer fires, we expect that we got an unpredictable
>   * jump in the cycle counter. Even if the timer is running on another
> @@ -1367,9 +1373,14 @@ EXPORT_SYMBOL_GPL(add_interrupt_randomness);
>   *
>   * So the re-arming always happens in the entropy loop itself.
>   */
> -static void entropy_timer(struct timer_list *t)
> +static void entropy_timer(struct timer_list *timer)
>  {
> -	credit_entropy_bits(1);
> +	struct entropy_timer_state *state = container_of(timer, struct entropy_timer_state, timer);
> +
> +	if (++state->samples == state->samples_per_bit) {
> +		credit_entropy_bits(1);
> +		state->samples = 0;
> +	}
>  }
>  
>  /*
> @@ -1378,17 +1389,22 @@ static void entropy_timer(struct timer_list *t)
>   */
>  static void try_to_generate_entropy(void)
>  {
> -	struct {
> -		unsigned long entropy;
> -		struct timer_list timer;
> -	} stack;
> -
> -	stack.entropy = random_get_entropy();
> +	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = 256 };
> +	struct entropy_timer_state stack;
> +	unsigned int i, num_different = 0;
> +	unsigned long last = random_get_entropy();
>  
> -	/* Slow counter - or none. Don't even bother */
> -	if (stack.entropy == random_get_entropy())
> +	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
> +		stack.entropy = random_get_entropy();
> +		if (stack.entropy != last)
> +			++num_different;
> +		last = stack.entropy;
> +	}
> +	stack.samples_per_bit = DIV_ROUND_UP(NUM_TRIAL_SAMPLES, num_different + 1);
> +	if (stack.samples_per_bit > MAX_SAMPLES_PER_BIT)
>  		return;
>  
> +	stack.samples = 0;
>  	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
>  	while (!crng_ready() && !signal_pending(current)) {
>  		if (!timer_pending(&stack.timer))


I've just seen on the platform with slow(ish) timer that it is now considered
as a source of entropy with samples_per_bit set to 27 (5.19-rc6 has MAX_SAMPLES_PER_BIT
set to 32). Because of that I see significant delays and I'm trying to understand what
could be wrong with my setup.

I observe one credit_init_bits(1) call (via entropy_timer()) per ~970 schedule() calls.
Is that somewhat expected? Does it make sense at all?

Cheers
Vladimir
Jason A. Donenfeld July 13, 2022, 2:40 p.m. UTC | #2
Hi Vladimir,

On Wed, Jul 13, 2022 at 03:31:05PM +0100, Vladimir Murzin wrote:
> I've just seen on the platform with slow(ish) timer that it is now considered
> as a source of entropy with samples_per_bit set to 27 (5.19-rc6 has MAX_SAMPLES_PER_BIT
> set to 32). Because of that I see significant delays and I'm trying to understand what
> could be wrong with my setup.
> 
> I observe one credit_init_bits(1) call (via entropy_timer()) per ~970 schedule() calls.
> Is that somewhat expected? Does it make sense at all?

How slow are we talking? Seconds? Minutes? Is it too slow? It's possible
that MAX_SAMPLES_PER_BIT=32 is a bit high as a threshold and I should
reduce that a bit.

Also, out of curiosity, why is your timer so slow?

Jason
Vladimir Murzin July 13, 2022, 2:52 p.m. UTC | #3
Hi Jason,

On 7/13/22 15:40, Jason A. Donenfeld wrote:
> Hi Vladimir,
> 
> On Wed, Jul 13, 2022 at 03:31:05PM +0100, Vladimir Murzin wrote:
>> I've just seen on the platform with slow(ish) timer that it is now considered
>> as a source of entropy with samples_per_bit set to 27 (5.19-rc6 has MAX_SAMPLES_PER_BIT
>> set to 32). Because of that I see significant delays and I'm trying to understand what
>> could be wrong with my setup.
>>
>> I observe one credit_init_bits(1) call (via entropy_timer()) per ~970 schedule() calls.
>> Is that somewhat expected? Does it make sense at all?
> 
> How slow are we talking? Seconds? Minutes? Is it too slow? It's possible
> that MAX_SAMPLES_PER_BIT=32 is a bit high as a threshold and I should
> reduce that a bit.
> 

TBH, I run out of patience and never seen it completes, more then seconds. I just was
curious how much it is should take to get crng_ready() return true.

> Also, out of curiosity, why is your timer so slow?

It is part of slow(ish) FPGA.

Cheers
Vladimir

> 
> Jason
Jason A. Donenfeld July 13, 2022, 2:53 p.m. UTC | #4
Hi Vladimir,

On Wed, Jul 13, 2022 at 4:52 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> Hi Jason,
>
> On 7/13/22 15:40, Jason A. Donenfeld wrote:
> > Hi Vladimir,
> >
> > On Wed, Jul 13, 2022 at 03:31:05PM +0100, Vladimir Murzin wrote:
> >> I've just seen on the platform with slow(ish) timer that it is now considered
> >> as a source of entropy with samples_per_bit set to 27 (5.19-rc6 has MAX_SAMPLES_PER_BIT
> >> set to 32). Because of that I see significant delays and I'm trying to understand what
> >> could be wrong with my setup.
> >>
> >> I observe one credit_init_bits(1) call (via entropy_timer()) per ~970 schedule() calls.
> >> Is that somewhat expected? Does it make sense at all?
> >
> > How slow are we talking? Seconds? Minutes? Is it too slow? It's possible
> > that MAX_SAMPLES_PER_BIT=32 is a bit high as a threshold and I should
> > reduce that a bit.
> >
>
> TBH, I run out of patience and never seen it completes, more then seconds. I just was
> curious how much it is should take to get crng_ready() return true.

Ooof. Yea, running this in a VM with various settings I can see that
the current maximum is problematic. I'll fix that up and send a patch.

Jason
diff mbox series

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index bf89c6f27a19..94a2ddb53662 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1354,6 +1354,12 @@  void add_interrupt_randomness(int irq)
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);
 
+struct entropy_timer_state {
+	unsigned long entropy;
+	struct timer_list timer;
+	unsigned int samples, samples_per_bit;
+};
+
 /*
  * Each time the timer fires, we expect that we got an unpredictable
  * jump in the cycle counter. Even if the timer is running on another
@@ -1367,9 +1373,14 @@  EXPORT_SYMBOL_GPL(add_interrupt_randomness);
  *
  * So the re-arming always happens in the entropy loop itself.
  */
-static void entropy_timer(struct timer_list *t)
+static void entropy_timer(struct timer_list *timer)
 {
-	credit_entropy_bits(1);
+	struct entropy_timer_state *state = container_of(timer, struct entropy_timer_state, timer);
+
+	if (++state->samples == state->samples_per_bit) {
+		credit_entropy_bits(1);
+		state->samples = 0;
+	}
 }
 
 /*
@@ -1378,17 +1389,22 @@  static void entropy_timer(struct timer_list *t)
  */
 static void try_to_generate_entropy(void)
 {
-	struct {
-		unsigned long entropy;
-		struct timer_list timer;
-	} stack;
-
-	stack.entropy = random_get_entropy();
+	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = 256 };
+	struct entropy_timer_state stack;
+	unsigned int i, num_different = 0;
+	unsigned long last = random_get_entropy();
 
-	/* Slow counter - or none. Don't even bother */
-	if (stack.entropy == random_get_entropy())
+	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
+		stack.entropy = random_get_entropy();
+		if (stack.entropy != last)
+			++num_different;
+		last = stack.entropy;
+	}
+	stack.samples_per_bit = DIV_ROUND_UP(NUM_TRIAL_SAMPLES, num_different + 1);
+	if (stack.samples_per_bit > MAX_SAMPLES_PER_BIT)
 		return;
 
+	stack.samples = 0;
 	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
 	while (!crng_ready() && !signal_pending(current)) {
 		if (!timer_pending(&stack.timer))