Message ID | 20220223185511.628452-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Series | random: do crng pre-init loading in worker rather than irq |
On Wed, Feb 23, 2022 at 07:55:11PM +0100, Jason A. Donenfeld wrote:
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1298,7 +1278,12 @@ static void mix_interrupt_randomness(struct work_struct *work)
>  	local_irq_enable();
>  
>  	mix_pool_bytes(pool, sizeof(pool));
> -	credit_entropy_bits(1);
> +
> +	if (unlikely(crng_init == 0))
> +		crng_pre_init_inject(pool, sizeof(pool), true);
> +	else
> +		credit_entropy_bits(1);
> +
>  	memzero_explicit(pool, sizeof(pool));
>  }

Might it make sense to call crng_pre_init_inject() before mix_pool_bytes?

Otherwise, all looks fine:

	Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>

Thanks
	Dominik

On 2/24/22, Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> On Wed, Feb 23, 2022 at 07:55:11PM +0100, Jason A. Donenfeld wrote:
>> --- a/drivers/char/random.c
>> +++ b/drivers/char/random.c
>> @@ -1298,7 +1278,12 @@ static void mix_interrupt_randomness(struct work_struct *work)
>>  	local_irq_enable();
>>  
>>  	mix_pool_bytes(pool, sizeof(pool));
>> -	credit_entropy_bits(1);
>> +
>> +	if (unlikely(crng_init == 0))
>> +		crng_pre_init_inject(pool, sizeof(pool), true);
>> +	else
>> +		credit_entropy_bits(1);
>> +
>>  	memzero_explicit(pool, sizeof(pool));
>>  }
>
> Might it make sense to call crng_pre_init_inject() before mix_pool_bytes?

What exactly is the difference you see mattering in the order? I keep
chasing my tail trying to think about it.

Jason

CC +Eric

On Wed, Feb 23, 2022 at 7:55 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Taking spinlocks from IRQ context is problematic for PREEMPT_RT. That
> is, in part, why we take trylocks instead. But apparently this still
> trips up various lock dependency analyzers. That seems like a bug in the
> analyzers that should be fixed, rather than having to change things
> here.
>
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -455,19 +451,15 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
>   * Returns the number of bytes processed from input, which is bounded
>   * by CRNG_INIT_CNT_THRESH if account is true.
>   */
> -static size_t crng_pre_init_inject(const void *input, size_t len,
> -				   bool fast, bool account)
> +static size_t crng_pre_init_inject(const void *input, size_t len, bool account)
>  {
>  	static int crng_init_cnt = 0;
> +	struct blake2s_state hash;
>  	unsigned long flags;
>  
> -	if (fast) {
> -		if (!spin_trylock_irqsave(&base_crng.lock, flags))
> -			return 0;
> -	} else {
> -		spin_lock_irqsave(&base_crng.lock, flags);
> -	}
> +	blake2s_init(&hash, sizeof(base_crng.key));
>  
> +	spin_lock_irqsave(&base_crng.lock, flags);
>  	if (crng_init != 0) {
>  		spin_unlock_irqrestore(&base_crng.lock, flags);
>  		return 0;
> @@ -1331,24 +1316,11 @@ void add_interrupt_randomness(int irq)
>  	fast_mix(fast_pool->pool32);
>  	new_count = ++fast_pool->count;
>  
> -	if (unlikely(crng_init == 0)) {
> -		if (new_count >= 64 &&
> -		    crng_pre_init_inject(fast_pool->pool32, sizeof(fast_pool->pool32),
> -					 true, true) > 0) {
> -			fast_pool->count = 0;
> -			fast_pool->last = now;
> -			if (spin_trylock(&input_pool.lock)) {
> -				_mix_pool_bytes(&fast_pool->pool32, sizeof(fast_pool->pool32));
> -				spin_unlock(&input_pool.lock);
> -			}
> -		}
> -		return;
> -	}
> -
>  	if (new_count & MIX_INFLIGHT)
>  		return;
>  
> -	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
> +	if (new_count < 64 && (!time_after(now, fast_pool->last + HZ) ||
> +			       unlikely(crng_init == 0)))
>  		return;
>  
>  	if (unlikely(!fast_pool->mix.func))
> --
> 2.35.1
>

FYI, I think you were concerned about those trylocks too. This should
make that go away.

Jason

On Thu, Feb 24, 2022 at 10:49:12AM +0100, Jason A. Donenfeld wrote:
> On 2/24/22, Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> > Might it make sense to call crng_pre_init_inject() before mix_pool_bytes?
>
> What exactly is the difference you see mattering in the order? I keep
> chasing my tail trying to think about it.

We had that order beforehand -- and even if it probably doesn't matter, this
means crng_pre_init_inject() gets called a tiny bit earlier. That means
there's a chance to progress to crng_init=1 a tiny bit earlier as well.

Thanks,
	Dominik

On Thu, Feb 24, 2022 at 4:11 PM Dominik Brodowski <linux@dominikbrodowski.net> wrote:
> We had that order beforehand -- and even if it probably doesn't matter, this
> means crng_pre_init_inject() gets called a tiny bit earlier. That means
> there's a chance to progress to crng_init=1 a tiny bit earlier as well.

Alright, I'll send a v2.

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 536237a0f073..9fb06fc298d3 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -443,10 +443,6 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
  * boot time when it's better to have something there rather than
  * nothing.
  *
- * There are two paths, a slow one and a fast one. The slow one
- * hashes the input along with the current key. The fast one simply
- * xors it in, and should only be used from interrupt context.
- *
  * If account is set, then the crng_init_cnt counter is incremented.
  * This shouldn't be set by functions like add_device_randomness(),
  * where we can't trust the buffer passed to it is guaranteed to be
@@ -455,19 +451,15 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
  * Returns the number of bytes processed from input, which is bounded
  * by CRNG_INIT_CNT_THRESH if account is true.
  */
-static size_t crng_pre_init_inject(const void *input, size_t len,
-				   bool fast, bool account)
+static size_t crng_pre_init_inject(const void *input, size_t len, bool account)
 {
 	static int crng_init_cnt = 0;
+	struct blake2s_state hash;
 	unsigned long flags;
 
-	if (fast) {
-		if (!spin_trylock_irqsave(&base_crng.lock, flags))
-			return 0;
-	} else {
-		spin_lock_irqsave(&base_crng.lock, flags);
-	}
+	blake2s_init(&hash, sizeof(base_crng.key));
 
+	spin_lock_irqsave(&base_crng.lock, flags);
 	if (crng_init != 0) {
 		spin_unlock_irqrestore(&base_crng.lock, flags);
 		return 0;
@@ -476,21 +468,9 @@ static size_t crng_pre_init_inject(const void *input, size_t len,
 	if (account)
 		len = min_t(size_t, len, CRNG_INIT_CNT_THRESH - crng_init_cnt);
 
-	if (fast) {
-		const u8 *src = input;
-		size_t i;
-
-		for (i = 0; i < len; ++i)
-			base_crng.key[(crng_init_cnt + i) %
-				      sizeof(base_crng.key)] ^= src[i];
-	} else {
-		struct blake2s_state hash;
-
-		blake2s_init(&hash, sizeof(base_crng.key));
-		blake2s_update(&hash, base_crng.key, sizeof(base_crng.key));
-		blake2s_update(&hash, input, len);
-		blake2s_final(&hash, base_crng.key);
-	}
+	blake2s_update(&hash, base_crng.key, sizeof(base_crng.key));
+	blake2s_update(&hash, input, len);
+	blake2s_final(&hash, base_crng.key);
 
 	if (account) {
 		crng_init_cnt += len;
@@ -1040,7 +1020,7 @@ void add_device_randomness(const void *buf, size_t size)
 	unsigned long flags;
 
 	if (crng_init == 0 && size)
-		crng_pre_init_inject(buf, size, false, false);
+		crng_pre_init_inject(buf, size, false);
 
 	spin_lock_irqsave(&input_pool.lock, flags);
 	_mix_pool_bytes(buf, size);
@@ -1157,7 +1137,7 @@ void add_hwgenerator_randomness(const void *buffer, size_t count,
 				size_t entropy)
 {
 	if (unlikely(crng_init == 0)) {
-		size_t ret = crng_pre_init_inject(buffer, count, false, true);
+		size_t ret = crng_pre_init_inject(buffer, count, true);
 		mix_pool_bytes(buffer, ret);
 		count -= ret;
 		buffer += ret;
@@ -1298,7 +1278,12 @@ static void mix_interrupt_randomness(struct work_struct *work)
 	local_irq_enable();
 
 	mix_pool_bytes(pool, sizeof(pool));
-	credit_entropy_bits(1);
+
+	if (unlikely(crng_init == 0))
+		crng_pre_init_inject(pool, sizeof(pool), true);
+	else
+		credit_entropy_bits(1);
+
 	memzero_explicit(pool, sizeof(pool));
 }
 
@@ -1331,24 +1316,11 @@ void add_interrupt_randomness(int irq)
 	fast_mix(fast_pool->pool32);
 	new_count = ++fast_pool->count;
 
-	if (unlikely(crng_init == 0)) {
-		if (new_count >= 64 &&
-		    crng_pre_init_inject(fast_pool->pool32, sizeof(fast_pool->pool32),
-					 true, true) > 0) {
-			fast_pool->count = 0;
-			fast_pool->last = now;
-			if (spin_trylock(&input_pool.lock)) {
-				_mix_pool_bytes(&fast_pool->pool32, sizeof(fast_pool->pool32));
-				spin_unlock(&input_pool.lock);
-			}
-		}
-		return;
-	}
-
 	if (new_count & MIX_INFLIGHT)
 		return;
 
-	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
+	if (new_count < 64 && (!time_after(now, fast_pool->last + HZ) ||
+			       unlikely(crng_init == 0)))
 		return;
 
 	if (unlikely(!fast_pool->mix.func))

Taking spinlocks from IRQ context is problematic for PREEMPT_RT. That
is, in part, why we take trylocks instead. But apparently this still
trips up various lock dependency analyzers. That seems like a bug in the
analyzers that should be fixed, rather than having to change things
here.

But maybe there's another reason to change things up: by deferring the
crng pre-init loading to the worker, we can use the cryptographic hash
function rather than xor, which is perhaps a meaningful difference when
considering this data has only been through the relatively weak
fast_mix() function.

The biggest downside of this approach is that the pre-init loading is
now deferred until later, which means things that need random numbers
after interrupts are enabled, but before workqueues are running -- or
before this particular worker manages to run -- are going to get into
trouble. Hopefully in the real world, this window is rather small,
especially since this code won't run until 64 interrupts have occurred.

Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/char/random.c | 62 ++++++++++++-------------------------------
 1 file changed, 17 insertions(+), 45 deletions(-)