
[v4,2/2] random: defer fast pool mixing to worker

Message ID: 20220209125644.533876-3-Jason@zx2c4.com (mailing list archive)
State: Not Applicable
Delegated to: Herbert Xu
Series: random: PREEMPT_RT fixes

Commit Message

Jason A. Donenfeld Feb. 9, 2022, 12:56 p.m. UTC
On PREEMPT_RT, it's problematic to take spinlocks from hard irq
handlers. We can fix this by deferring the dumping of the fast pool
into the input pool to a workqueue.

We accomplish this with some careful rules on fast_pool->count:

  - When it's incremented to >= 64, we schedule the work.
  - If the top bit is set, we never schedule the work, even if >= 64.
  - The worker is responsible for setting it back to 0 when it's done.

In the worst case, an irq handler is mixing a new irq into the pool at
the same time as the worker is dumping it into the input pool. In this
case, we only ever set the count back to 0 _after_ we're done, so that
subsequent cycles will require a full 64 to dump it in again. In other
words, the result of this race is only ever adding a little bit more
information than normal, but never less, and never crediting any more
for this partial additional information.
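
Condensed, the irq path and the worker interact roughly like this (a
sketch distilled from the patch below, not the literal hunks):

	#define FAST_POOL_MIX_INFLIGHT (1U << 31)	/* top bit of fast_pool->count */

	/* hard irq path, add_interrupt_randomness() */
	new_count = ++fast_pool->count;
	if (new_count & FAST_POOL_MIX_INFLIGHT)
		return;		/* a worker is already queued; don't schedule another */
	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
		return;		/* pool not full enough yet */
	fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);

	/* worker, mix_interrupt_randomness() */
	memcpy(pool, fast_pool->pool, sizeof(pool));	/* consistent snapshot */
	WRITE_ONCE(fast_pool->count, 0);	/* only after the read; also clears INFLIGHT */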

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/char/random.c         | 54 ++++++++++++++++++++++-------------
 include/trace/events/random.h |  6 ----
 2 files changed, 34 insertions(+), 26 deletions(-)

Comments

Sebastian Andrzej Siewior Feb. 10, 2022, 6:04 p.m. UTC | #1
On 2022-02-09 13:56:44 [+0100], Jason A. Donenfeld wrote:
> +static void mix_interrupt_randomness(struct work_struct *work)
> +{
> +	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
> +	u8 pool[sizeof(fast_pool->pool)];

So.
- CPU1 schedules a worker
- CPU1 goes offline before the worker gets on the CPU.
- The worker runs on CPU2
- CPU2 is back online
- and now
   CPU1						CPU2
   new_count = ++fast_pool->count;
    reg = fast_pool->count (FAST_POOL_MIX_INFLIGHT | 64)
    incl reg (FAST_POOL_MIX_INFLIGHT | 65)
    						WRITE_ONCE(fast_pool->count, 0);
   fast_pool->count = reg (FAST_POOL_MIX_INFLIGHT | 65)

So we lost the WRITE_ONCE(, 0); FAST_POOL_MIX_INFLIGHT is still set and
the worker is not scheduled. Not easy to trigger, not by an ordinary user.
Just wanted to mention…

…
> @@ -999,9 +1016,10 @@ void add_interrupt_randomness(int irq)
>  
>  	fast_mix(fast_pool);
>  	add_interrupt_bench(cycles);
> +	new_count = ++fast_pool->count;
>  
>  	if (unlikely(crng_init == 0)) {
> -		if ((fast_pool->count >= 64) &&
> +		if (new_count >= 64 &&
>  		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {

crng_fast_load() does spin_trylock_irqsave() in hardirq context. It does
not produce any warning on RT but is still wrong IMHO:
- lockdep will see a random task and I remember in the past it produced
  strange lock chains based on this.

- Should another task attempt to acquire this lock then it will PI-boost the
  wrong task.

If we could just move this, too.

I don't know how timing-critical this is, but the first backtrace from
crng_fast_load() came (to my surprise) from hwrng_fillfn() (a kthread)
and added 64 bytes in one go.

I moved that crng_fast_load() into the worker and collected some
numbers:
           <idle>-0       [000] d..h1..     2.069924: add_interrupt_randomness: Tick

first interrupt
…
        swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: work

the 64th interrupt, scheduling the worker.

        swapper/0-1       [000] d..h.11     2.345937: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h111     2.349938: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h.11     2.353939: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h.11     2.357940: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h111     2.361939: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h111     2.365939: add_interrupt_randomness: Tick
        swapper/0-1       [000] d..h.11     2.369941: add_interrupt_randomness: Tick
     kworker/0:0H-6       [000] .......     2.384714: mix_interrupt_randomness: load
     kworker/0:0H-6       [000] .......     2.384715: crng_fast_load: 16
           <idle>-0       [001] dn.h1..     3.205766: add_interrupt_randomness: Tick
           <idle>-0       [019] dn.h1..     6.771047: add_interrupt_randomness: Tick

7 interrupts got lost before the worker could run & load the first 16 bytes.
The workqueue core gets initialized at that point and spawns the first
worker. After that, the interrupts took a break.
And then the work-to-load delay was quite low:

           <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: Tick
           <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: work
    kworker/19:0H-175     [019] .......     7.586504: mix_interrupt_randomness: load
    kworker/19:0H-175     [019] .......     7.586507: crng_fast_load: 16
           <idle>-0       [020] dn.h1..     7.614649: add_interrupt_randomness: Tick
           <idle>-0       [020] dn.h1..     7.614651: add_interrupt_randomness: work
           <idle>-0       [020] dn.h1..     7.614736: add_interrupt_randomness: Tick
    kworker/20:0H-183     [020] dn.h...     7.614859: add_interrupt_randomness: Tick
    kworker/20:0H-183     [020] .......     7.614871: mix_interrupt_randomness: load
    kworker/20:0H-183     [020] .......     7.614872: crng_fast_load: 16
           <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: Tick
           <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: work
    kworker/18:0H-167     [018] dn.h1..     8.352438: add_interrupt_randomness: Tick
    kworker/18:0H-167     [018] dn.h1..     8.352448: add_interrupt_randomness: Tick
    kworker/18:0H-167     [018] dn.h1..     8.352459: add_interrupt_randomness: Tick
    kworker/18:0H-167     [018] dn.h1..     8.352491: add_interrupt_randomness: Tick
    kworker/18:0H-167     [018] .......     8.352505: mix_interrupt_randomness: load
    kworker/18:0H-167     [018] .......     8.352506: crng_fast_load: 16

In total we lost 13 ticks.
I did the same test on PREEMPT_VOLUNTARY and lost only 2 ticks.

Sebastian
Jason A. Donenfeld Feb. 11, 2022, 12:42 a.m. UTC | #2
Hi Sebastian,

On Thu, Feb 10, 2022 at 7:04 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> So.
> - CPU1 schedules a worker
> - CPU1 goes offline before the worker gets on the CPU.
> - The worker runs on CPU2
> - CPU2 is back online
> - and now
>    CPU1                                         CPU2
>    new_count = ++fast_pool->count;
>     reg = fast_pool->count (FAST_POOL_MIX_INFLIGHT | 64)
>     incl reg (FAST_POOL_MIX_INFLIGHT | 65)
>                                                 WRITE_ONCE(fast_pool->count, 0);
>     fast_pool->count = reg (FAST_POOL_MIX_INFLIGHT | 65)
>
> So we lost the WRITE_ONCE(, 0); FAST_POOL_MIX_INFLIGHT is still set and
> the worker is not scheduled. Not easy to trigger, not by an ordinary user.
> Just wanted to mention…

Thanks for pointing this out. I'll actually fix this using atomics,
and fix another minor issue at the same time the same way, and move to
making sure the worker is running on the right CPU like we originally
discussed. I'm going to send that as an additional patch so that we
can narrow in on the issue there. It's a little bit involved but not
too bad. I'll have that for you shortly.
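
Roughly, as an untested sketch (the actual patch may well end up
looking different):

	/* sketch: an atomic_t count means the worker's reset and the irq
	 * path's increment can no longer lose updates to each other */
	new_count = (unsigned int)atomic_inc_return(&fast_pool->count);
	if (new_count & FAST_POOL_MIX_INFLIGHT)
		return;
	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
		return;
	atomic_or(FAST_POOL_MIX_INFLIGHT, &fast_pool->count);
	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);

	/* worker side: the 0 now races only against full atomic RMWs, so
	 * it cannot be overwritten by a stale register value */
	atomic_set(&fast_pool->count, 0);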

> crng_fast_load() does spin_trylock_irqsave() in hardirq context. It does
> not produce any warning on RT but is still wrong IMHO:
> If we could just move this, too.
> I don't know how timing-critical this is, but the first backtrace from
> crng_fast_load() came (to my surprise) from hwrng_fillfn() (a kthread)
> and added 64 bytes in one go.

I'll look into seeing if I can do it. On my first pass a few days ago,
it seemed a bit too tricky, but I'll revisit after this part here
settles. Thanks for your benchmarks, by the way. That's useful.

Jason
Sebastian Andrzej Siewior Feb. 11, 2022, 7:09 a.m. UTC | #3
On 2022-02-11 01:42:56 [+0100], Jason A. Donenfeld wrote:
> Hi Sebastian,
Hi Jason,

> Thanks for pointing this out. I'll actually fix this using atomics,
> and fix another minor issue at the same time the same way, and move to
> making sure the worker is running on the right CPU like we originally
> discussed. I'm going to send that as an additional patch so that we
> can narrow in on the issue there. It's a little bit involved but not
> too bad. I'll have that for you shortly.

oki.

> > crng_fast_load() does spin_trylock_irqsave() in hardirq context. It does
> > not produce any warning on RT but is still wrong IMHO:
> > If we could just move this, too.
> > I don't know how timing-critical this is, but the first backtrace from
> > crng_fast_load() came (to my surprise) from hwrng_fillfn() (a kthread)
> > and added 64 bytes in one go.
> 
> I'll look into seeing if I can do it. On my first pass a few days ago,
> it seemed a bit too tricky, but I'll revisit after this part here
> settles. Thanks for your benchmarks, by the way. That's useful.

If you want me to do it again or on other machines, just let me know.
That was from a bigger x86 machine. I added that series and the patch at
the bottom to my RT queue.

-------->8--------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 10 Feb 2022 18:22:05 +0100
Subject: [PATCH] random: Move crng_fast_load() to the worker.

crng_fast_load() is invoked from hard IRQ context and acquires a
spinlock_t via a trylock. If the lock is locked in hard IRQ context then
the following locking attempt (on another CPU) will PI-boost the wrong
task.

Move the crng_fast_load() invocation into the worker.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/char/random.c |   27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1047,6 +1047,17 @@ static void mix_interrupt_randomness(str
 	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
 	u8 pool[sizeof(fast_pool->pool)];
 
+	if (unlikely(crng_init == 0)) {
+		size_t ret;
+
+		ret = crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool));
+		if (ret) {
+			WRITE_ONCE(fast_pool->count, 0);
+			fast_pool->last = jiffies;
+			return;
+		}
+	}
+
 	/*
 	 * Since this is the result of a trip through the scheduler, xor in
 	 * a cycle counter. It can't hurt, and might help.
@@ -1089,11 +1100,17 @@ void add_interrupt_randomness(int irq)
 	new_count = ++fast_pool->count;
 
 	if (unlikely(crng_init == 0)) {
-		if (new_count >= 64 &&
-		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
-			fast_pool->count = 0;
-			fast_pool->last = now;
-		}
+		if (new_count & FAST_POOL_MIX_INFLIGHT)
+			return;
+
+		if (new_count < 64)
+			return;
+
+		fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
+		if (unlikely(!fast_pool->mix.func))
+			INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
+		queue_work_on(raw_smp_processor_id(), system_highpri_wq,
+			      &fast_pool->mix);
 		return;
 	}
 

> Jason

Sebastian
Dominik Brodowski Feb. 11, 2022, 8:25 a.m. UTC | #4
On Thu, Feb 10, 2022 at 07:04:20PM +0100, Sebastian Andrzej Siewior wrote:
> > @@ -999,9 +1016,10 @@ void add_interrupt_randomness(int irq)
> >  
> >  	fast_mix(fast_pool);
> >  	add_interrupt_bench(cycles);
> > +	new_count = ++fast_pool->count;
> >  
> >  	if (unlikely(crng_init == 0)) {
> > -		if ((fast_pool->count >= 64) &&
> > +		if (new_count >= 64 &&
> >  		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
> 
> crng_fast_load() does spin_trylock_irqsave() in hardirq context. It does
> not produce any warning on RT but is still wrong IMHO:
> - lockdep will see a random task and I remember in the past it produced
>   strange lock chains based on this.
> 
> - Should another task attempt to acquire this lock then it will PI-boost the
>   wrong task.
> 
> If we just could move this, too.
> 
> I don't know how timing critical this is but the first backtrace from
> crng_fast_load() came (to my surprise) from hwrng_fillfn() (a kthread)
> and added 64bytes in one go.

That's a hw rng (such as a tpm chip or the virtio-rng driver) providing some
entropy; if it's 64 bytes of input, crng_init progresses to 1, and
crng_fast_load() should never be called again.[*] I'm a bit surprised that the
hw_rng input occurred so early (it's only at device_initcall() level), and
earlier than 64 interrupts. But that may differ from system to system.

Note that crng_fast_load() will also never be called from
add_interrupt_randomness() if

	EFI, DT or kexec provides bootloader entropy of at least 64 bytes,
	and CONFIG_RANDOM_TRUST_BOOTLOADER is set

and/or	CONFIG_RANDOM_TRUST_CPU is set and the RDRAND/RDSEED instructions do
	not fail.

If none of these three conditions (hw_rng run early, bootloader or CPU
randomness) is met, the initial and early seeding of the base_crng depends
on add_interrupt_randomness(), and should happen rather quickly.
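
To make the precedence explicit, a summary sketch (the predicate names
are made up for illustration, not real kernel symbols):

	/* hypothetical summary of the three early-seeding paths above */
	static bool seeded_before_interrupts(void)
	{
		if (IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER) && bootloader_bytes >= 64)
			return true;	/* EFI, DT or kexec entropy */
		if (IS_ENABLED(CONFIG_RANDOM_TRUST_CPU) && rdrand_rdseed_ok)
			return true;	/* RDRAND/RDSEED did not fail */
		if (early_hwrng_bytes >= 64)
			return true;	/* e.g. TPM or virtio-rng, via the hwrng kthread */
		return false;		/* seeding falls to add_interrupt_randomness() */
	}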

> I moved that crng_fast_load() into the worker and collected some
> numbers:
>            <idle>-0       [000] d..h1..     2.069924: add_interrupt_randomness: Tick
> 
> first interrupt
> …
>         swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: work
> 
> the 64th interrupt, scheduling the worker.
> 
>         swapper/0-1       [000] d..h.11     2.345937: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h111     2.349938: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h.11     2.353939: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h.11     2.357940: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h111     2.361939: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h111     2.365939: add_interrupt_randomness: Tick
>         swapper/0-1       [000] d..h.11     2.369941: add_interrupt_randomness: Tick
>      kworker/0:0H-6       [000] .......     2.384714: mix_interrupt_randomness: load
>      kworker/0:0H-6       [000] .......     2.384715: crng_fast_load: 16
>            <idle>-0       [001] dn.h1..     3.205766: add_interrupt_randomness: Tick
>            <idle>-0       [019] dn.h1..     6.771047: add_interrupt_randomness: Tick
> 
> 7 interrupts got lost before the worker could run & load the first 16 bytes.
> The workqueue core gets initialized at that point and spawns the first
> worker.

So the reason for the longer delay here is that the workqueue core had not
been initialized beforehand?

> After that, the interrupts took a break.
> And then the work-to-load delay was quite low:
> 
>            <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: Tick
>            <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: work
>     kworker/19:0H-175     [019] .......     7.586504: mix_interrupt_randomness: load
>     kworker/19:0H-175     [019] .......     7.586507: crng_fast_load: 16
>            <idle>-0       [020] dn.h1..     7.614649: add_interrupt_randomness: Tick
>            <idle>-0       [020] dn.h1..     7.614651: add_interrupt_randomness: work
>            <idle>-0       [020] dn.h1..     7.614736: add_interrupt_randomness: Tick
>     kworker/20:0H-183     [020] dn.h...     7.614859: add_interrupt_randomness: Tick
>     kworker/20:0H-183     [020] .......     7.614871: mix_interrupt_randomness: load
>     kworker/20:0H-183     [020] .......     7.614872: crng_fast_load: 16
>            <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: Tick
>            <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: work
>     kworker/18:0H-167     [018] dn.h1..     8.352438: add_interrupt_randomness: Tick
>     kworker/18:0H-167     [018] dn.h1..     8.352448: add_interrupt_randomness: Tick
>     kworker/18:0H-167     [018] dn.h1..     8.352459: add_interrupt_randomness: Tick
>     kworker/18:0H-167     [018] dn.h1..     8.352491: add_interrupt_randomness: Tick
>     kworker/18:0H-167     [018] .......     8.352505: mix_interrupt_randomness: load
>     kworker/18:0H-167     [018] .......     8.352506: crng_fast_load: 16
> 
> In total we lost 13 ticks.

Was this still way before the initramfs was up and running?

> I did the same test on PREEMPT_VOLUNTARY and lost only 2 ticks.

Thanks,
	Dominik

[*] Actually, there's some contradiction going on: if we do not trust the
hw_rng device (that is, its quality setting is 0), crng_fast_load() will be
called nonetheless, and the hw_rng-provided input will be used to increment
crng_init to 1. If !CONFIG_RANDOM_TRUST_BOOTLOADER, only crng_slow_load() is
called, and crng_init will remain at 0. Similar for
!CONFIG_RANDOM_TRUST_CPU.
Sebastian Andrzej Siewior Feb. 11, 2022, 2:18 p.m. UTC | #5
On 2022-02-11 09:25:00 [+0100], Dominik Brodowski wrote:
> That's a hw rng (such as a tpm chip or the virtio-rng driver) providing some
> entropy; if it's 64 bytes of input, crng_init progresses to 1, and
> crng_fast_load() should never be called again.[*] I'm a bit surprised that the
> hw_rng input occurred so early (it's only at device_initcall() level), and
> earlier than 64 interrupts. But that may differ from system to system.
> 
> Note that crng_fast_load() will also never be called from
> add_interrupt_randomness() if
> 
> 	EFI, DT or kexec provides bootloader entropy of at least 64 bytes,
> 	and CONFIG_RANDOM_TRUST_BOOTLOADER is set
> 
> and/or	CONFIG_RANDOM_TRUST_CPU is set and the RDRAND/RDSEED instructions do
> 	not fail.
> 
> If none of these three conditions (hw_rng run early, bootloader or CPU
> randomness) is met, the initial and early seeding of the base_crng depends
> on add_interrupt_randomness(), and should happen rather quickly.

Right.

> > I moved that crng_fast_load() into the worker and collected some
> > numbers:
> >            <idle>-0       [000] d..h1..     2.069924: add_interrupt_randomness: Tick
> > 
> > first interrupt
> > …
> >         swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h.11     2.341938: add_interrupt_randomness: work
> > 
> > the 64th interrupt, scheduling the worker.
> > 
> >         swapper/0-1       [000] d..h.11     2.345937: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h111     2.349938: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h.11     2.353939: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h.11     2.357940: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h111     2.361939: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h111     2.365939: add_interrupt_randomness: Tick
> >         swapper/0-1       [000] d..h.11     2.369941: add_interrupt_randomness: Tick
> >      kworker/0:0H-6       [000] .......     2.384714: mix_interrupt_randomness: load
> >      kworker/0:0H-6       [000] .......     2.384715: crng_fast_load: 16
> >            <idle>-0       [001] dn.h1..     3.205766: add_interrupt_randomness: Tick
> >            <idle>-0       [019] dn.h1..     6.771047: add_interrupt_randomness: Tick
> > 
> > 7 interrupts got lost before the worker could run & load the first 16 bytes.
> > The workqueue core gets initialized at that point and spawns the first
> > worker.
> 
> So the reason for the longer delay here is that the workqueue core had not
> been initialized beforehand?

Kind of, yes. workqueue_init_early() happens quite early, so we don't
have to worry about a NULL pointer for system_highpri_wq. The worker
only started after workqueue_init() had populated the worker pools.
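
I.e. the relevant boot ordering, paraphrased from init/main.c:

	start_kernel()
		workqueue_init_early();	/* system_highpri_wq allocated, queueing is safe */
		...
	kernel_init_freeable()
		workqueue_init();	/* worker pools populated, queued work can run */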

> > After that, the interrupts took a break.
> > And then the work-to-load delay was quite low:
> > 
> >            <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: Tick
> >            <idle>-0       [019] dn.h1..     7.586234: add_interrupt_randomness: work
> >     kworker/19:0H-175     [019] .......     7.586504: mix_interrupt_randomness: load
> >     kworker/19:0H-175     [019] .......     7.586507: crng_fast_load: 16
> >            <idle>-0       [020] dn.h1..     7.614649: add_interrupt_randomness: Tick
> >            <idle>-0       [020] dn.h1..     7.614651: add_interrupt_randomness: work
> >            <idle>-0       [020] dn.h1..     7.614736: add_interrupt_randomness: Tick
> >     kworker/20:0H-183     [020] dn.h...     7.614859: add_interrupt_randomness: Tick
> >     kworker/20:0H-183     [020] .......     7.614871: mix_interrupt_randomness: load
> >     kworker/20:0H-183     [020] .......     7.614872: crng_fast_load: 16
> >            <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: Tick
> >            <idle>-0       [018] dn.h1..     8.352423: add_interrupt_randomness: work
> >     kworker/18:0H-167     [018] dn.h1..     8.352438: add_interrupt_randomness: Tick
> >     kworker/18:0H-167     [018] dn.h1..     8.352448: add_interrupt_randomness: Tick
> >     kworker/18:0H-167     [018] dn.h1..     8.352459: add_interrupt_randomness: Tick
> >     kworker/18:0H-167     [018] dn.h1..     8.352491: add_interrupt_randomness: Tick
> >     kworker/18:0H-167     [018] .......     8.352505: mix_interrupt_randomness: load
> >     kworker/18:0H-167     [018] .......     8.352506: crng_fast_load: 16
> > 
> > In total we lost 13 ticks.
> 
> Was this still way before the initramfs was up and running?

After the initramfs was unpacked. From the current boot:

| [    5.901462] Unpacking initramfs...
| [    6.758747] sd 1:0:0:0: [sda] Attached SCSI disk
| [    7.886651] Freeing initrd memory: 9532K
| [    7.893753] Freeing unused kernel image (initmem) memory: 2184K
| [    7.963519] Write protecting the kernel read-only data: 20480k
| [    7.971465] Freeing unused kernel image (text/rodata gap) memory: 2032K
| [    7.979711] Freeing unused kernel image (rodata/data gap) memory: 1980K
| [    7.987132] Run /init as init process
| Loading, please wait...
| 
| Starting version 250.3-2
| [    8.157529] igb 0000:07:00.0 eno0: renamed from eth0
| [    8.203836] igb 0000:07:00.1 enp7s0f1: renamed from eth1
| [    8.219489] random: fast init done
| Begin: Loading essential drivers ... done.
| Begin: Running /scripts/init-premount ... done.
| Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
| Begin: Running /scripts/local-premount ... done.
| Warning: fsck not present, so skipping root file[    8.337554] XFS (sda2): Mounting V5 Filesystem
| [    8.392151] XFS (sda2): Ending clean mount
| done.
| Begin: Running /scripts/local-bottom ... done.
| Begin: Running /scripts/init-bottom ... done.
| [    8.540708] systemd[1]: systemd 250.3-2 running in system mode …
| [   12.207227] random: crng init done

I have to note that on this system there is an initramfs, but all drivers
are built-in (you see sd and net _after_ "Unpacking initramfs", but there
are no drivers to load).

> Thanks,
> 	Dominik

Sebastian

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index ceded1c4f73b..f985d84872de 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -377,12 +377,6 @@  static void _mix_pool_bytes(const void *in, int nbytes)
 	blake2s_update(&input_pool.hash, in, nbytes);
 }
 
-static void __mix_pool_bytes(const void *in, int nbytes)
-{
-	trace_mix_pool_bytes_nolock(nbytes, _RET_IP_);
-	_mix_pool_bytes(in, nbytes);
-}
-
 static void mix_pool_bytes(const void *in, int nbytes)
 {
 	unsigned long flags;
@@ -394,11 +388,13 @@  static void mix_pool_bytes(const void *in, int nbytes)
 }
 
 struct fast_pool {
-	u32 pool[4];
+	struct work_struct mix;
 	unsigned long last;
+	u32 pool[4];
+	unsigned int count;
 	u16 reg_idx;
-	u8 count;
 };
+#define FAST_POOL_MIX_INFLIGHT (1U << 31)
 
 /*
  * This is a fast mixing routine used by the interrupt randomness
@@ -428,7 +424,6 @@  static void fast_mix(struct fast_pool *f)
 
 	f->pool[0] = a;  f->pool[1] = b;
 	f->pool[2] = c;  f->pool[3] = d;
-	f->count++;
 }
 
 static void process_random_ready_list(void)
@@ -977,12 +972,34 @@  static u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
 	return *ptr;
 }
 
+static void mix_interrupt_randomness(struct work_struct *work)
+{
+	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
+	u8 pool[sizeof(fast_pool->pool)];
+
+	/*
+	 * Since this is the result of a trip through the scheduler, xor in
+	 * a cycle counter. It can't hurt, and might help.
+	 */
+	fast_pool->pool[3] ^= random_get_entropy();
+	/* Copy the pool to the stack so that the mixer always has a consistent view. */
+	memcpy(pool, fast_pool->pool, sizeof(pool));
+	/* We take care to zero out the count only after we're done reading the pool. */
+	WRITE_ONCE(fast_pool->count, 0);
+	fast_pool->last = jiffies;
+
+	mix_pool_bytes(pool, sizeof(pool));
+	credit_entropy_bits(1);
+	memzero_explicit(pool, sizeof(pool));
+}
+
 void add_interrupt_randomness(int irq)
 {
 	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs *regs = get_irq_regs();
 	unsigned long now = jiffies;
 	cycles_t cycles = random_get_entropy();
+	unsigned int new_count;
 	u32 c_high, j_high;
 	u64 ip;
 
@@ -999,9 +1016,10 @@  void add_interrupt_randomness(int irq)
 
 	fast_mix(fast_pool);
 	add_interrupt_bench(cycles);
+	new_count = ++fast_pool->count;
 
 	if (unlikely(crng_init == 0)) {
-		if ((fast_pool->count >= 64) &&
+		if (new_count >= 64 &&
 		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
 			fast_pool->count = 0;
 			fast_pool->last = now;
@@ -1009,20 +1027,16 @@  void add_interrupt_randomness(int irq)
 		return;
 	}
 
-	if ((fast_pool->count < 64) && !time_after(now, fast_pool->last + HZ))
+	if (new_count & FAST_POOL_MIX_INFLIGHT)
 		return;
 
-	if (!spin_trylock(&input_pool.lock))
+	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
 		return;
 
-	fast_pool->last = now;
-	__mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
-	spin_unlock(&input_pool.lock);
-
-	fast_pool->count = 0;
-
-	/* award one bit for the contents of the fast pool */
-	credit_entropy_bits(1);
+	if (unlikely(!fast_pool->mix.func))
+		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
+	fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
+	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);
 
diff --git a/include/trace/events/random.h b/include/trace/events/random.h
index ad149aeaf42c..833f42afc70f 100644
--- a/include/trace/events/random.h
+++ b/include/trace/events/random.h
@@ -52,12 +52,6 @@  DEFINE_EVENT(random__mix_pool_bytes, mix_pool_bytes,
 	TP_ARGS(bytes, IP)
 );
 
-DEFINE_EVENT(random__mix_pool_bytes, mix_pool_bytes_nolock,
-	TP_PROTO(int bytes, unsigned long IP),
-
-	TP_ARGS(bytes, IP)
-);
-
 TRACE_EVENT(credit_entropy_bits,
 	TP_PROTO(int bits, int entropy_count, unsigned long IP),