[v2] random: reseed more often immediately after booting

Message ID 20220309191850.1508953-1-Jason@zx2c4.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Series [v2] random: reseed more often immediately after booting

Commit Message

Jason A. Donenfeld March 9, 2022, 7:18 p.m. UTC
In order to chip away at the "premature first" problem, we augment our
existing entropy accounting with increased reseedings at boot. The idea
is that at boot, we're getting entropy from various places, and we're
not very sure which of the early boot entropy is good and which isn't. Even
when we're crediting the entropy, we're still not totally certain that
it's any good. Since boot is the one time (aside from a compromise) that
we have zero entropy, it's important that we shepherd entropy into the
crng fairly often. At the same time, we don't want a "premature next"
problem, whereby an attacker can brute force individual bits of added
entropy. In lieu of going full-on Fortuna (for now), we can pick a
simpler strategy of just reseeding more often during the first 5 minutes
after boot. This is still bounded by the 256-bit entropy credit
requirement, so we'll skip a reseeding if we haven't reached that, but
in case entropy /is/ coming in, this ensures that it makes its way into
the crng rather rapidly during these early stages. For this we start at
5 seconds after boot, and double that interval until it's more than 5
minutes. After that, we move to our normal schedule of reseeding
not more than once per 5 minutes.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
Changes v1->v2:
- Simplified arithmetic, prevented overflow.

 drivers/char/random.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

Comments

Eric Biggers March 10, 2022, 4:57 a.m. UTC | #1
Hi Jason,

On Wed, Mar 09, 2022 at 12:18:50PM -0700, Jason A. Donenfeld wrote:
> In order to chip away at the "premature first" problem, we augment our
> existing entropy accounting with increased reseedings at boot.

I'm very glad to see this; this is something that I've been concerned about.
I think this is basically the right solution until something more sophisticated
can be implemented (as you said).

A few comments below.

> The idea
> is that at boot, we're getting entropy from various places, and we're
> not very sure which of the early boot entropy is good and which isn't. Even
> when we're crediting the entropy, we're still not totally certain that
> it's any good. Since boot is the one time (aside from a compromise) that
> we have zero entropy, it's important that we shepherd entropy into the
> crng fairly often. At the same time, we don't want a "premature next"
> problem, whereby an attacker can brute force individual bits of added
> entropy. In lieu of going full-on Fortuna (for now), we can pick a
> simpler strategy of just reseeding more often during the first 5 minutes
> after boot. This is still bounded by the 256-bit entropy credit
> requirement, so we'll skip a reseeding if we haven't reached that, but
> in case entropy /is/ coming in, this ensures that it makes its way into
> the crng rather rapidly during these early stages. For this we start at
> 5 seconds after boot, and double that interval until it's more than 5
> minutes. After that, we move to our normal schedule of reseeding
> not more than once per 5 minutes.

Break up the above into multiple paragraphs?

> +/*
> + * Return whether the crng seed is considered to be sufficiently
> + * old that a reseeding might be attempted. This is the case 5,
> + * 10, 20, 40, 80, and 160 seconds after boot, and afterwards
> + * if the last reseeding was more than CRNG_RESEED_INTERVAL ago.
> + */
> +static bool crng_has_old_seed(void)
> +{
> +	static unsigned int next_init_secs = 5;
> +
> +	if (unlikely(next_init_secs < CRNG_RESEED_INTERVAL / HZ)) {

The read of 'next_init_secs' needs READ_ONCE(), since it can be written to
concurrently.

> +		unsigned int uptime = min_t(u64, INT_MAX, ktime_get_seconds());
> +		if (uptime >= READ_ONCE(next_init_secs)) {
> +			WRITE_ONCE(next_init_secs, 5U << fls(uptime / 5));
> +			return true;
> +		}
> +		return false;

The '5U << fls(uptime / 5)' expression is a little hard to understand, but it
appears to work as intended.
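
To make the schedule concrete, here is a quick userspace sketch of what
that expression produces -- a sketch only, with __builtin_clz() standing
in for the kernel's fls() and arbitrary sample uptimes:

	#include <stdio.h>

	/* Stand-in for the kernel's fls(): 1-based index of the highest
	 * set bit, or 0 when the argument is 0. */
	static unsigned int my_fls(unsigned int x)
	{
		return x ? 32 - (unsigned int)__builtin_clz(x) : 0;
	}

	int main(void)
	{
		/* Arbitrary sample uptimes, in seconds. */
		unsigned int uptimes[] = { 5, 7, 10, 20, 40, 79, 80, 160 };
		unsigned int i;

		for (i = 0; i < sizeof(uptimes) / sizeof(uptimes[0]); i++)
			printf("uptime=%3u -> next_init_secs=%u\n",
			       uptimes[i], 5U << my_fls(uptimes[i] / 5));
		return 0;
	}

This prints thresholds of 10, 10, 20, 40, 80, 80, 160, and 320
respectively -- i.e. the 5, 10, 20, 40, 80, 160 schedule, with 320 then
exceeding CRNG_RESEED_INTERVAL / HZ so that the normal path takes over.
Note that uptime 79 maps to a next threshold of 80, which is what makes
the back-to-back case below possible.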

However, one thing that seems a bit odd is that this method can result in two
reseeds with very little time in between.  For example, if no one is using the
RNG from second 40-78, but then it is used in seconds 79-80, then it will be
reseeded at both seconds 79 and 80 if there is entropy available.

Perhaps the condition should still be:

	time_after(jiffies, READ_ONCE(base_crng.birth) + interval);

... as it is in the non-early case, but where 'interval' is a function of
'uptime' rather than always CRNG_RESEED_INTERVAL?  Maybe something like:

	interval = CRNG_RESEED_INTERVAL;
	if (uptime < 2 * CRNG_RESEED_INTERVAL / HZ)
		interval = max(5, uptime / 2) * HZ;

- Eric
Jason A. Donenfeld March 10, 2022, 8:59 p.m. UTC | #2
Hey Eric,

On Wed, Mar 9, 2022 at 9:57 PM Eric Biggers <ebiggers@kernel.org> wrote:
> > not more than once per 5 minutes.
>
> Break up the above into multiple paragraphs?

Will do.

>
> > +/*
> > + * Return whether the crng seed is considered to be sufficiently
> > + * old that a reseeding might be attempted. This is the case 5,
> > + * 10, 20, 40, 80, and 160 seconds after boot, and afterwards
> > + * if the last reseeding was more than CRNG_RESEED_INTERVAL ago.
> > + */
> > +static bool crng_has_old_seed(void)
> > +{
> > +     static unsigned int next_init_secs = 5;
> > +
> > +     if (unlikely(next_init_secs < CRNG_RESEED_INTERVAL / HZ)) {
>
> The read of 'next_init_secs' needs READ_ONCE(), since it can be written to
> concurrently.

Thanks, will fix.

> However, one thing that seems a bit odd is that this method can result in two
> reseeds with very little time in between.  For example, if no one is using the
> RNG from second 40-78, but then it is used in seconds 79-80, then it will be
> reseeded at both seconds 79 and 80 if there is entropy available.

I've been sort of going back and forth on this. I think the idea is
that there are two relative time measurements. The ordinary one we use
is time since last reseeding. But at boot, we're trying to account for
the fact that entropy is coming in relative to the init process of the
system, which means we need it to be relative to boot time rather than
relative to the last reseeding. As you point out, this is a little
wonky with how things are now, because we only ever reseed on usage.
To get closer to what I have in mind, we could reseed from a timer, so
that it fires exactly at the 5, 10, 20, 40, etc. marks. But that seems
a bit cumbersome and maybe unnecessary. The effect of the v1 behavior
you pointed out is:

- We might reseed at 79, and then fail to reseed at 80. Consequence:
we lose 1 second of entropy that could have made it in for that try.
- We might reseed at 79, and then also reseed at 80. Consequence:
that's a fairly quick refresh, but on the other hand, we're still
requiring 256 bits of credit, so maybe it's not so bad; and if we've
got so much entropy coming in during that small period of time, maybe
it really isn't a concern.

So I'm not sure either of these cases really matter that much.

> Perhaps the condition should still be:
>
>         time_after(jiffies, READ_ONCE(base_crng.birth) + interval);
>
> ... as it is in the non-early case, but where 'interval' is a function of
> 'uptime' rather than always CRNG_RESEED_INTERVAL?  Maybe something like:
>
>         interval = CRNG_RESEED_INTERVAL;
>         if (uptime < 2 * CRNG_RESEED_INTERVAL / HZ)
>                 interval = max(5, uptime / 2) * HZ;

I'd experimented with things like that, for example making it exponential:

  static bool early_boot = true;
  unsigned long interval = CRNG_RESEED_INTERVAL;

  if (unlikely(READ_ONCE(early_boot))) {
    unsigned int uptime = min_t(u64, INT_MAX, ktime_get_seconds());
    if (uptime >= CRNG_RESEED_INTERVAL / HZ)
      WRITE_ONCE(early_boot, false);
    else
      interval = (5U << fls(uptime / 5)) * HZ;
  }
  return time_after(jiffies, READ_ONCE(base_crng.birth) + interval);

But the whole thing of combining relative-to-last-reseed with
relative-to-boottime seems really strange. I'm not quite sure what
that's supposed to represent, whereas what I have in v1 is
"exponentially increasing intervals from boottime" which is fairly
easy to understand.

Jason
Eric Biggers March 12, 2022, 7:44 p.m. UTC | #3
On Thu, Mar 10, 2022 at 01:59:05PM -0700, Jason A. Donenfeld wrote:
> > However, one thing that seems a bit odd is that this method can result in two
> > reseeds with very little time in between.  For example, if no one is using the
> > RNG from second 40-78, but then it is used in seconds 79-80, then it will be
> > reseeded at both seconds 79 and 80 if there is entropy available.
> 
> I've been sort of going back and forth on this. I think the idea is
> that there are two relative time measurements. The ordinary one we use
> is time since last reseeding. But at boot, we're trying to account for
> the fact that entropy is coming in relative to the init process of the
> system, which means we need it to be relative to boot time rather than
> relative to the last reseeding. As you point out, this is a little
> wonky with how things are now, because we only ever reseed on usage.
> To get closer to what I have in mind, we could reseed from a timer, so
> that it fires exactly at the 5, 10, 20, 40, etc. marks. But that seems
> a bit cumbersome and maybe unnecessary. The effect of the v1 behavior
> you pointed out is:
> 
> - We might reseed at 79, and then fail to reseed at 80. Consequence:
> we lose 1 second of entropy that could have made it in for that try.
> - We might reseed at 79, and then also reseed at 80. Consequence:
> that's a fairly quick refresh, but on the other hand, we're still
> requiring 256 bits of credit, so maybe it's not so bad; and if we've
> got so much entropy coming in during that small period of time, maybe
> it really isn't a concern.
> 
> So I'm not sure either of these cases really matter that much.
> 
> > Perhaps the condition should still be:
> >
> >         time_after(jiffies, READ_ONCE(base_crng.birth) + interval);
> >
> > ... as it is in the non-early case, but where 'interval' is a function of
> > 'uptime' rather than always CRNG_RESEED_INTERVAL?  Maybe something like:
> >
> >         interval = CRNG_RESEED_INTERVAL;
> >         if (uptime < 2 * CRNG_RESEED_INTERVAL / HZ)
> >                 interval = max(5, uptime / 2) * HZ;
> 
> I'd experimented with things like that, for example making it exponential:
> 
>   static bool early_boot = true;
>   unsigned long interval = CRNG_RESEED_INTERVAL;
> 
>   if (unlikely(READ_ONCE(early_boot))) {
>     unsigned int uptime = min_t(u64, INT_MAX, ktime_get_seconds());
>     if (uptime >= CRNG_RESEED_INTERVAL / HZ)
>       WRITE_ONCE(early_boot, false);
>     else
>       interval = (5U << fls(uptime / 5)) * HZ;
>   }
>   return time_after(jiffies, READ_ONCE(base_crng.birth) + interval);
> 
> But the whole thing of combining relative-to-last-reseed with
> relative-to-boottime seems really strange. I'm not quite sure what
> that's supposed to represent, whereas what I have in v1 is
> "exponentially increasing intervals from boottime" which is fairly
> easy to understand.

I don't think it's strange.  Maybe it seems strange because of how you wrote it
('interval = (5U << fls(uptime / 5)) * HZ'), where the reseed interval suddenly
jumps from X to 2*X seconds.  The version I suggested is 'interval = max(5,
uptime / 2) * HZ', which is smoother.  It's simply saying that the reseed
interval increases as the uptime increases, which seems to be what we want.
(Bounded by [5*HZ, CRNG_RESEED_INTERVAL], of course.)

Your current patch interacts badly with how crng_reseed() is a no-op if fewer
than 256 entropy bits are available.  Suppose that random bytes are requested at
seconds 10 and 15, and at second 10 there are 200 entropy bits available and at
second 15 there are 256.  With your current patch, the CRNG wouldn't be reseeded
before either request, since the no-op reseed at second 10 would consume the
only attempt for the whole interval of 10-19.  I.e. the request at second 10
would actually be preventing the CRNG from being reseeded when it should be.
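
Tracing crng_has_old_seed() through that scenario:

	t=10s: uptime >= next_init_secs, so the function returns true and
	       bumps next_init_secs to 20 -- but crng_reseed() then no-ops,
	       since only 200 of the required 256 bits are available.
	t=15s: 15 < 20, so the function returns false -- no reseed, even
	       though 256 bits are now available.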

Using an interval since the actual last reseed time (base_crng.birth) avoids
this quirk, as well as the one I mentioned before where two reseeds can be done
with very little time in between.

What you have now is still better than the status quo, but I'm not sure it's the
best way.

- Eric
Jason A. Donenfeld March 13, 2022, 12:35 a.m. UTC | #4
Hey Eric,

On Sat, Mar 12, 2022 at 12:44 PM Eric Biggers <ebiggers@kernel.org> wrote:
> I don't think it's strange.  Maybe it seems strange because of how you wrote it
> ('interval = (5U << fls(uptime / 5)) * HZ'), where the reseed interval suddenly
> jumps from X to 2*X seconds.  The version I suggested is 'interval = max(5,
> uptime / 2) * HZ', which is smoother.  It's simply saying that the reseed
> interval increases as the uptime increases, which seems to be what we want.
> (Bounded by [5*HZ, CRNG_RESEED_INTERVAL], of course.)
> What you have now is still better than the status quo, but I'm not sure it's the
> best way.

To be clear, I'm not opposed to your suggestion. I just don't
understand it... yet. I've been playing around with this python script
to try to "see" what it's doing:

```
#!/usr/bin/env python3
import sys

# How often (in simulated seconds) the RNG gets used.
stride = int(sys.argv[1])

lastyes = 0  # uptime of the last reseed

# Eric's suggested interval, as a function of uptime x.
def e(x):
    return max(5, x / 2)

# Mimics time_after(now, birth + interval): reseed iff x > lastyes + e(x).
def f(x):
    global lastyes
    if lastyes + e(x) - x < 0:
        lastyes = x
        return True
    return False

li = 0
for i in range(0, 300, stride):
    if f(i):
        print(i, i - li)  # reseed time, and gap since the previous reseed
        li = i
```
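
(For reference, running this with stride 1 prints hits at 6, 13, 27,
55, 111, and 223; the printed gaps -- 6, 7, 14, 28, 56, 112 -- roughly
double each time.)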

And I can sort of see that for constant calls, it doubles the
interval as you'd expect. But I still don't see how this is related
to system uptime in some definite way. The reason for having a
heuristic like this in the first place is that we are assuming that
there's some (inverse) correlation between the need for entropy and
the system boot time, and another correlation between the availability
of entropy and the system boot time. I'm just not "getting" how your
formula correlates to that. I'm not saying it's wrong, but just that I
might be a bit slow in grasping the relation. Can you give some more
details on what's happening? I'll continue to stare at it and poke
around with my python script of course, but if you could help that'd
be appreciated.

Alternatively, I had mentioned and then dismissed the timer approach
earlier, but actually maybe it's not as bad as I originally thought?
Just having a timer fire at 5, 10, 20, 40, 80, 160 seconds or
something like that? Do you share my original allergy to that idea,
or might that actually be an okay way forward?

Jason
Eric Biggers March 13, 2022, 1 a.m. UTC | #5
On Sat, Mar 12, 2022 at 05:35:08PM -0700, Jason A. Donenfeld wrote:
> Hey Eric,
> 
> On Sat, Mar 12, 2022 at 12:44 PM Eric Biggers <ebiggers@kernel.org> wrote:
> > I don't think it's strange.  Maybe it seems strange because of how you wrote it
> > ('interval = (5U << fls(uptime / 5)) * HZ'), where the reseed interval suddenly
> > jumps from X to 2*X seconds.  The version I suggested is 'interval = max(5,
> > uptime / 2) * HZ', which is smoother.  It's simply saying that the reseed
> > interval increases as the uptime increases, which seems to be what we want.
> > (Bounded by [5*HZ, CRNG_RESEED_INTERVAL], of course.)
> > What you have now is still better than the status quo, but I'm not sure it's the
> > best way.
> 
> To be clear, I'm not opposed to your suggestion. I just don't
> understand it... yet. I've been playing around with this python script
> to try to "see" what it's doing:
> 
> ```
> #!/usr/bin/env python3
> import sys
> 
> # How often (in simulated seconds) the RNG gets used.
> stride = int(sys.argv[1])
>
> lastyes = 0  # uptime of the last reseed
>
> # Eric's suggested interval, as a function of uptime x.
> def e(x):
>     return max(5, x / 2)
>
> # Mimics time_after(now, birth + interval): reseed iff x > lastyes + e(x).
> def f(x):
>     global lastyes
>     if lastyes + e(x) - x < 0:
>         lastyes = x
>         return True
>     return False
>
> li = 0
> for i in range(0, 300, stride):
>     if f(i):
>         print(i, i - li)  # reseed time, and gap since the previous reseed
>         li = i
> ```
> 
> And I can sort of see that for constant calls, it doubles the
> interval as you'd expect. But I still don't see how this is related
> to system uptime in some definite way. The reason for having a
> heuristic like this in the first place is that we are assuming that
> there's some (inverse) correlation between the need for entropy and
> the system boot time, and another correlation between the availability
> of entropy and the system boot time. I'm just not "getting" how your
> formula correlates to that. I'm not saying it's wrong, but just that I
> might be a bit slow in grasping the relation. Can you give some more
> details on what's happening? I'll continue to stare at it and poke
> around with my python script of course, but if you could help that'd
> be appreciated.

It's just increasing the reseed interval linearly with the uptime, with constant
factor 0.5.  So if the last reseed happened at uptime=t, then the next reseed
will happen on the first request made with uptime >= 2*t.
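
Spelled out, with x the current uptime and t the uptime at the last
reseed, both in seconds:

	x > t + max(5, x/2)      the suggested reseed condition
	x > t + x/2              for x >= 10, the max() picks x/2
	x > 2*t                  i.e. the next reseed comes at double
	                         the uptime of the previous one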

> 
> Alternatively, I had mentioned and then dismissed the timer approach
> earlier, but actually maybe it's not as bad as I originally thought?
> Just having a timer fire at 5, 10, 20, 40, 80, 160 seconds or
> something like that? Do you share my original allergy to that idea,
> or might that actually be an okay way forward?

That seems strictly worse than what you have now (though still better than the
status quo, of course).  The only motivation I can see is if we'd want to avoid
reseeds during the requests themselves for performance reasons.  It seems that
that's not really an issue, though?

- Eric
Jason A. Donenfeld March 13, 2022, 1:29 a.m. UTC | #6
Hey Eric,

On Sat, Mar 12, 2022 at 6:01 PM Eric Biggers <ebiggers@kernel.org> wrote:
> It's just increasing the reseed interval linearly with the uptime, with constant
> factor 0.5.  So if the last reseed happened at uptime=t, then the next reseed
> will happen on the first request made with uptime >= 2*t.

Thanks, okay, I think I get it. The uptime of the next reseeding will
be at least double the uptime of the prior one. Let's try it your way.
I'll send a v+1 shortly.

Jason

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0ceda9a12bfe..8c08186205f4 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -333,6 +333,27 @@  static void crng_fast_key_erasure(u8 key[CHACHA_KEY_SIZE],
 	memzero_explicit(first_block, sizeof(first_block));
 }
 
+/*
+ * Return whether the crng seed is considered to be sufficiently
+ * old that a reseeding might be attempted. This is the case 5,
+ * 10, 20, 40, 80, and 160 seconds after boot, and afterwards
+ * if the last reseeding was more than CRNG_RESEED_INTERVAL ago.
+ */
+static bool crng_has_old_seed(void)
+{
+	static unsigned int next_init_secs = 5;
+
+	if (unlikely(next_init_secs < CRNG_RESEED_INTERVAL / HZ)) {
+		unsigned int uptime = min_t(u64, INT_MAX, ktime_get_seconds());
+		if (uptime >= READ_ONCE(next_init_secs)) {
+			WRITE_ONCE(next_init_secs, 5U << fls(uptime / 5));
+			return true;
+		}
+		return false;
+	}
+	return time_after(jiffies, READ_ONCE(base_crng.birth) + CRNG_RESEED_INTERVAL);
+}
+
 /*
  * This function returns a ChaCha state that you may use for generating
  * random data. It also returns up to 32 bytes on its own of random data
@@ -366,10 +387,10 @@  static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
 	}
 
 	/*
-	 * If the base_crng is more than 5 minutes old, we reseed, which
-	 * in turn bumps the generation counter that we check below.
+	 * If the base_crng is old enough, we try to reseed, which in turn
+	 * bumps the generation counter that we check below.
 	 */
-	if (unlikely(time_after(jiffies, READ_ONCE(base_crng.birth) + CRNG_RESEED_INTERVAL)))
+	if (unlikely(crng_has_old_seed()))
 		crng_reseed(false);
 
 	local_lock_irqsave(&crngs.lock, flags);