Message ID | 20220211162515.554867-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Series | [v6] random: defer fast pool mixing to worker |
On 2022-02-11 17:25:15 [+0100], Jason A. Donenfeld wrote:
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index c42c07a7eb56..20b11a4b6559 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1214,12 +1215,59 @@ static u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
>  	return *ptr;
>  }
>
> +static void mix_interrupt_randomness(struct work_struct *work)
> +{
> +	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
> +	unsigned long pool[ARRAY_SIZE(fast_pool->pool)];
> +	int count;
> +
> +	/* Check to see if we're running on the wrong CPU due to hotplug. */
> +	migrate_disable();
> +	if (fast_pool != this_cpu_ptr(&irq_randomness)) {
> +		migrate_enable();
> +		/*
> +		 * If we are unlucky enough to have been moved to another CPU,

+ "during CPU hotplug while the CPU was shutdown". It should not look
like the worker can be migrated on system without CPU-hotplug involved.

> +		 * then we set our count to zero atomically so that when the
> +		 * CPU comes back online, it can enqueue work again. The
> +		 * _release here pairs with the atomic_inc_return_acquire in
> +		 * add_interrupt_randomness().
> +		 */
> +		atomic_set_release(&fast_pool->count, 0);
> +		return;
> +	}
> +
> +	/*
> +	 * Copy the pool to the stack so that the mixer always has a
> +	 * consistent view. It's extremely unlikely but possible that
> +	 * this 2 or 4 word read is interrupted by an irq, but in case
> +	 * it is, we double check that count stays the same.
> +	 *
> +	 * We set the count to 0 so that irqs can immediately begin to
> +	 * accumulate again after. Since any possible interruptions
> +	 * at this stage are guaranteed to be on the same CPU, we can
> +	 * use cmpxchg_relaxed.
> +	 */
> +	count = atomic_read(&fast_pool->count);
> +	do {
> +		memcpy(pool, fast_pool->pool, sizeof(pool));
> +	} while (atomic_try_cmpxchg_relaxed(&fast_pool->count, &count, 0));

I *think* we could drop that "fast_pool !=
this_cpu_ptr(&irq_randomness)" check at the top since that cmpxchg will
save us and redo the loop. But if I remember correctly you worried about
fast_pool->pool being modified (which is only a corner case if we are on
the other CPU while the orig CPU is back again). Either way, it would be
random and we would not consume more entropy.

So if we have to keep this then please swap that migrate_disable() with
local_irq_disable(). Otherwise PeterZ will yell at me.

> +	fast_pool->last = jiffies;
> +	migrate_enable();
> +
> +	mix_pool_bytes(pool, sizeof(pool));
> +	credit_entropy_bits(1);
> +	memzero_explicit(pool, sizeof(pool));
> +}
> +
…
> @@ -1235,12 +1283,13 @@ void add_interrupt_randomness(int irq)
>  	}
>
>  	fast_mix((u32 *)fast_pool->pool);
> -	++fast_pool->count;
> +	/* The _acquire here pairs with the atomic_set_release in mix_interrupt_randomness(). */
> +	new_count = (unsigned int)atomic_inc_return_acquire(&fast_pool->count);
>
>  	if (unlikely(crng_init == 0)) {
> -		if (fast_pool->count >= 64 &&
> +		if (new_count >= 64 &&
>  		    crng_fast_load(fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
> -			fast_pool->count = 0;
> +			atomic_set(&fast_pool->count, 0);
>  			fast_pool->last = now;

I'm fine if we keep this as is for now.
What do we do here vs RT? I suggested this
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?id=a2d2d54409481aa23a3e11ab9559a843e36a79ec

Is this doable?

Sebastian
Hi Sebastian,

On Fri, Feb 11, 2022 at 5:44 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> > +
> > +	/* Check to see if we're running on the wrong CPU due to hotplug. */
> > +	migrate_disable();
> > +	if (fast_pool != this_cpu_ptr(&irq_randomness)) {
> > +		migrate_enable();
> > +		/*
> > +		 * If we are unlucky enough to have been moved to another CPU,
>
> + "during CPU hotplug while the CPU was shutdown". It should not look
> like the worker can be migrated on system without CPU-hotplug involved.

Will adjust comment.

> I *think* we could drop that "fast_pool !=
> this_cpu_ptr(&irq_randomness)" check at the top since that cmpxchg will
> save us and redo the loop. But if I remember correctly you worried about
> fast_pool->pool being modified (which is only a corner case if we are on
> the other CPU while the orig CPU is back again). Either way, it would be
> random and we would not consume more entropy.

No, we cannot, and "it's all random anyway so who cares if we corrupt
things!" is not rigorous, as entropy may actually be thrown away as
it's moved between words on each mix. If we're not running on the same
CPU, one CPU can corrupt the other's view of fast pool before updating
count. We must keep this.

> So if we have to keep this then please swap that migrate_disable() with
> local_irq_disable(). Otherwise PeterZ will yell at me.

Okay, I'll do that then, and then in the process get rid of the
cmpxchg loop since it's no longer required.

> >  	if (unlikely(crng_init == 0)) {
> > -		if (fast_pool->count >= 64 &&
> > +		if (new_count >= 64 &&
> >  		    crng_fast_load(fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
> > -			fast_pool->count = 0;
> > +			atomic_set(&fast_pool->count, 0);
> >  			fast_pool->last = now;
>
> I'm fine if we keep this as is for now.
> What do we do here vs RT? I suggested this
> https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?id=a2d2d54409481aa23a3e11ab9559a843e36a79ec
>
> Is this doable?

It might be, but last time I checked it seemed problematic. As I
mentioned in an earlier thread, I'll take a look again at that next
week after this patch here settles. Haven't forgotten.

v+1 coming up with irqs disabled.

Jason
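A minimal sketch of the shape that agreed-upon change could take, with migrate_disable() swapped for local_irq_disable() and the cmpxchg loop dropped, reusing the helpers from the quoted patch. Illustrative only; this is not the v7 that was actually posted:

static void mix_interrupt_randomness(struct work_struct *work)
{
	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
	unsigned long pool[ARRAY_SIZE(fast_pool->pool)];

	/*
	 * With hard irqs off, nothing on this CPU can touch the fast pool
	 * concurrently, so a plain copy and reset replaces the cmpxchg loop.
	 */
	local_irq_disable();
	if (fast_pool != this_cpu_ptr(&irq_randomness)) {
		local_irq_enable();
		/*
		 * The worker was moved off its home CPU by CPU hotplug; reset
		 * the count so that CPU can enqueue work again once it is back.
		 */
		atomic_set_release(&fast_pool->count, 0);
		return;
	}

	memcpy(pool, fast_pool->pool, sizeof(pool));
	atomic_set(&fast_pool->count, 0);
	fast_pool->last = jiffies;
	local_irq_enable();

	mix_pool_bytes(pool, sizeof(pool));
	credit_entropy_bits(1);
	memzero_explicit(pool, sizeof(pool));
}

In this shape the atomic_t is only still needed for the hotplug path at the top, which is exactly what the rest of the thread goes on to discuss.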
On 2022-02-11 17:50:34 [+0100], Jason A. Donenfeld wrote:
> Hi Sebastian,

Hi Jason,

> > I *think* we could drop that "fast_pool !=
> > this_cpu_ptr(&irq_randomness)" check at the top since that cmpxchg will
> > save us and redo the loop. But if I remember correctly you worried about
> > fast_pool->pool being modified (which is only a corner case if we are on
> > the other CPU while the orig CPU is back again). Either way, it would be
> > random and we would not consume more entropy.
>
> No, we cannot, and "it's all random anyway so who cares if we corrupt
> things!" is not rigorous, as entropy may actually be thrown away as
> it's moved between words on each mix. If we're not running on the same
> CPU, one CPU can corrupt the other's view of fast pool before updating
> count. We must keep this.

Okay, I assumed something like that.

> > So if we have to keep this then please swap that migrate_disable() with
> > local_irq_disable(). Otherwise PeterZ will yell at me.
>
> Okay, I'll do that then, and then in the process get rid of the
> cmpxchg loop since it's no longer required.

So the only reason why we have that atomic_t is for the rare case where we run
on the remote CPU and need to remove the upper bit in the counter?

> > >  	if (unlikely(crng_init == 0)) {
> > > -		if (fast_pool->count >= 64 &&
> > > +		if (new_count >= 64 &&
> > >  		    crng_fast_load(fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
> > > -			fast_pool->count = 0;
> > > +			atomic_set(&fast_pool->count, 0);
> > >  			fast_pool->last = now;
> >
> > I'm fine if we keep this as is for now.
> > What do we do here vs RT? I suggested this
> > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?id=a2d2d54409481aa23a3e11ab9559a843e36a79ec
> >
> > Is this doable?
>
> It might be, but last time I checked it seemed problematic. As I
> mentioned in an earlier thread, I'll take a look again at that next
> week after this patch here settles. Haven't forgotten.

Ah, cheers.

> v+1 coming up with irqs disabled.
>
> Jason

Sebastian
Hi Sebastian,

On Fri, Feb 11, 2022 at 5:59 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> > Okay, I'll do that then, and then in the process get rid of the
> > cmpxchg loop since it's no longer required.
>
> So the only reason why we have that atomic_t is for the rare case where we run
> on the remote CPU and need to remove the upper bit in the counter?

Yes. That's the only remaining reason. Annoying, but whatareyagonnado?

Jason
On 2022-02-11 18:00:21 [+0100], Jason A. Donenfeld wrote:
> Hi Sebastian,

Hi Jason,

> On Fri, Feb 11, 2022 at 5:59 PM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> > > Okay, I'll do that then, and then in the process get rid of the
> > > cmpxchg loop since it's no longer required.
> >
> > So the only reason why we have that atomic_t is for the rare case where we run
> > on the remote CPU and need to remove the upper bit in the counter?
>
> Yes. That's the only remaining reason. Annoying, but whatareyagonnado?

A CPU hotplug notifier which removes unconditionally that bit when the
CPU goes down or sets it to 0.
We can keep it as is. Just an idea for later maybe ;)

> Jason

Sebastian
On Fri, Feb 11, 2022 at 6:15 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> A CPU hotplug notifier which removes unconditionally that bit when the
> CPU goes down or sets it to 0.
> We can keep it as is. Just an idea for later maybe ;)

I looked into it and it seemed like the plumbing was kind of miserable
for that. If you want to take a stab, though, that might be an okay
followup patch, and then we can assess atomics vs notifier. I think
notifier will wind up being a lot clunkier, though.

Sounds like we should be all set for the v7 I sent out?

Jason
On 2022-02-11 18:17:57 [+0100], Jason A. Donenfeld wrote:
> On Fri, Feb 11, 2022 at 6:15 PM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> > A CPU hotplug notifier which removes unconditionally that bit when the
> > CPU goes down or sets it to 0.
> > We can keep it as is. Just an idea for later maybe ;)
>
> I looked into it and it seemed like the plumbing was kind of miserable
> for that. If you want to take a stab, though, that might be an okay
> followup patch, and then we can assess atomics vs notifier. I think
> notifier will wind up being a lot clunkier, though.
>
> Sounds like we should be all set for the v7 I sent out?

Sure. I can do the CPU-HP notifier later on once we are done with
everything. I acked the v7, don't see a road block.

> Jason

Sebastian
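A rough sketch of the CPU-HP notifier idea being parked here, assuming a dynamic hotplug state is used; the state, callback name, and init hook are made up for illustration and this is not the follow-up patch that appears later in the thread:

/*
 * Hypothetical teardown callback: it runs once the CPU is down, so nothing
 * can race with it on that CPU's fast pool.
 */
static int random_dead_cpu(unsigned int cpu)
{
	struct fast_pool *fast_pool = per_cpu_ptr(&irq_randomness, cpu);

	/*
	 * Drop MIX_INFLIGHT and the accumulated count so the CPU starts
	 * from a clean slate if it comes back online.
	 */
	atomic_set(&fast_pool->count, 0);
	return 0;
}

static int __init random_cpuhp_init(void)
{
	int ret;

	/* Dynamic state: no startup callback, teardown clears the pool count. */
	ret = cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "random:fast_pool",
				NULL, random_dead_cpu);
	return ret < 0 ? ret : 0;
}

With something like this in place, the worker would no longer need the atomic upper bit to recover from hotplug, which is the performance angle picked up again further down the thread.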
On Fri, Feb 11, 2022 at 5:59 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> > > What do we do here vs RT? I suggested this
> > > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?id=a2d2d54409481aa23a3e11ab9559a843e36a79ec
> > >
> > > Is this doable?
> >
> > It might be, but last time I checked it seemed problematic. As I
> > mentioned in an earlier thread, I'll take a look again at that next
> > week after this patch here settles. Haven't forgotten.

I started looking at this and came up with this draft with questions:
https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/commit/?h=jd/no-irq-trylocks

Some research remains...

Jason
Hey Sebastian,

On Fri, Feb 11, 2022 at 6:26 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> Sure. I can do the CPU-HP notifier later on once we are done with
> everything. I acked the v7, don't see a road block.

I've been running this over the weekend and performance is generally okay.

# perf top --symbols $(objdump -t /usr/src/linux/drivers/char/random.o|fgrep 'F .text'|awk '{print $6}'|tr '\n' ,) -F max -g

One thing I noticed though was that add_interrupt_randomness spends
most of its time on:

    lock xadd %eax,0x38(%rbp)

So, as expected, those atomics really need to go. Indeed we might be
best off with the CPU hotplug notifier for setting the count back to
zero.

Do you want to prepare a patch for this? Or should I take a stab at it?

Jason
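To make that concrete, this is roughly what the hot path could look like once the counter stops being an atomic_t: a plain per-CPU increment, so no lock-prefixed instruction is left in add_interrupt_randomness(). The field type and the trimmed-down flow are assumptions for illustration, not the patch linked in the next message:

void add_interrupt_randomness(int irq)
{
	enum { MIX_INFLIGHT = 1U << 31 };
	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
	unsigned long now = jiffies;
	unsigned long new_count;

	/* ... cycles/irq/regs are mixed into fast_pool->pool as before ... */
	fast_mix((u32 *)fast_pool->pool);

	/*
	 * Plain increment: with count an unsigned long touched only by this
	 * CPU (and by hotplug once the CPU is dead), no "lock xadd" remains.
	 */
	new_count = ++fast_pool->count;

	/* (The crng_init == 0 early-boot branch is omitted for brevity.) */

	if (new_count & MIX_INFLIGHT)
		return;

	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
		return;

	if (unlikely(!fast_pool->mix.func))
		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
	fast_pool->count |= MIX_INFLIGHT;
	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
}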
On 2022-02-13 18:37:33 [+0100], Jason A. Donenfeld wrote:
> I started looking at this and came up with this draft with questions:
> https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/commit/?h=jd/no-irq-trylocks

to
| - Does anything anywhere call get_random_xx() before the worker has a
|   chance to run?

Once you queue a work item I don't think that the scheduler needs to put
it on the CPU right away. It may already have other tasks waiting,
including some with a RT priority.
Also, the lock is irqsave() so there can be users in an interrupt
handler. I remember the original reason why I made it irqsave is because
something did kmalloc() and SLUB somehow asked for random bits.

> Some research remains...
>
> Jason

Sebastian
On 2/14/22, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> to
> | - Does anything anywhere call get_random_xx() before the worker has a
> |   chance to run?
>
> Once you queue a work item I don't think that the scheduler needs to put
> it on the CPU right away. It may already have other tasks waiting,
> including some with a RT priority.
> Also, the lock is irqsave() so there can be users in an interrupt
> handler. I remember the original reason why I made it irqsave is because
> something did kmalloc() and SLUB somehow asked for random bits.

Right. So there are two sides of the question: 1) how bad is this
actual race, and are there any drivers that do regularly get bit by
this? 2) There's a largeish window between workqueue_init_early()
setting up the system highprio workqueue, and workqueue_init()
enabling queued workers to actually run. Interrupts also get enabled
in the interim. Does anything get bit by that window?

Jason
On 2/13/22, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hey Sebastian,
>
> On Fri, Feb 11, 2022 at 6:26 PM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
>> Sure. I can do the CPU-HP notifier later on once we are done with
>> everything. I acked the v7, don't see a road block.
>
> I've been running this over the weekend and performance is generally okay.
>
> # perf top --symbols $(objdump -t /usr/src/linux/drivers/char/random.o|fgrep 'F .text'|awk '{print $6}'|tr '\n' ,) -F max -g
>
> One thing I noticed though was that add_interrupt_randomness spends
> most of its time on:
>
>     lock xadd %eax,0x38(%rbp)
>
> So, as expected, those atomics really need to go. Indeed we might be
> best off with the CPU hotplug notifier for setting the count back to
> zero.
>
> Do you want to prepare a patch for this? Or should I take a stab at it?

FYI, I was overwhelmed with a compulsion to try doing it myself and
posted that here
https://lore.kernel.org/lkml/20220213215343.11652-1-Jason@zx2c4.com/
which is now pending your review.
On 2022-02-14 11:17:20 [+0100], Jason A. Donenfeld wrote:
> On 2/14/22, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > to
> > | - Does anything anywhere call get_random_xx() before the worker has a
> > |   chance to run?
> >
> > Once you queue a work item I don't think that the scheduler needs to put
> > it on the CPU right away. It may already have other tasks waiting,
> > including some with a RT priority.
> > Also, the lock is irqsave() so there can be users in an interrupt
> > handler. I remember the original reason why I made it irqsave is because
> > something did kmalloc() and SLUB somehow asked for random bits.
>
> Right. So there are two sides of the question: 1) how bad is this
> actual race, and are there any drivers that do regularly get bit by
> this? 2) There's a largeish window between workqueue_init_early()
> setting up the system highprio workqueue, and workqueue_init()
> enabling queued workers to actually run. Interrupts also get enabled
> in the interim. Does anything get bit by that window?

This is only important during boot-up, right? Otherwise it just extracts
entropy from the pool.
I posted numbers earlier on where the work got scheduled and the three
or four interrupts came in before the work-item was scheduled. I could
send you the diff if you want to up it on some machines.

> Jason

Sebastian
On Mon, Feb 14, 2022 at 12:16 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-02-14 11:17:20 [+0100], Jason A. Donenfeld wrote:
> > On 2/14/22, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > > to
> > > | - Does anything anywhere call get_random_xx() before the worker has a
> > > |   chance to run?
> > >
> > > Once you queue a work item I don't think that the scheduler needs to put
> > > it on the CPU right away. It may already have other tasks waiting,
> > > including some with a RT priority.
> > > Also, the lock is irqsave() so there can be users in an interrupt
> > > handler. I remember the original reason why I made it irqsave is because
> > > something did kmalloc() and SLUB somehow asked for random bits.
> >
> > Right. So there are two sides of the question: 1) how bad is this
> > actual race, and are there any drivers that do regularly get bit by
> > this? 2) There's a largeish window between workqueue_init_early()
> > setting up the system highprio workqueue, and workqueue_init()
> > enabling queued workers to actually run. Interrupts also get enabled
> > in the interim. Does anything get bit by that window?
>
> This is only important during boot-up, right?

Right. This is a pre-init window only. But a bunch of things are done
pre-init -- siphash secret keys, aslr seeds, and so forth.
diff --git a/drivers/char/random.c b/drivers/char/random.c
index c42c07a7eb56..20b11a4b6559 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1162,9 +1162,10 @@ EXPORT_SYMBOL_GPL(add_bootloader_randomness);
 
 struct fast_pool {
 	unsigned long pool[16 / sizeof(long)];
+	struct work_struct mix;
 	unsigned long last;
+	atomic_t count;
 	u16 reg_idx;
-	u8 count;
 };
 
 /*
@@ -1214,12 +1215,59 @@ static u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
 	return *ptr;
 }
 
+static void mix_interrupt_randomness(struct work_struct *work)
+{
+	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
+	unsigned long pool[ARRAY_SIZE(fast_pool->pool)];
+	int count;
+
+	/* Check to see if we're running on the wrong CPU due to hotplug. */
+	migrate_disable();
+	if (fast_pool != this_cpu_ptr(&irq_randomness)) {
+		migrate_enable();
+		/*
+		 * If we are unlucky enough to have been moved to another CPU,
+		 * then we set our count to zero atomically so that when the
+		 * CPU comes back online, it can enqueue work again. The
+		 * _release here pairs with the atomic_inc_return_acquire in
+		 * add_interrupt_randomness().
+		 */
+		atomic_set_release(&fast_pool->count, 0);
+		return;
+	}
+
+	/*
+	 * Copy the pool to the stack so that the mixer always has a
+	 * consistent view. It's extremely unlikely but possible that
+	 * this 2 or 4 word read is interrupted by an irq, but in case
+	 * it is, we double check that count stays the same.
+	 *
+	 * We set the count to 0 so that irqs can immediately begin to
+	 * accumulate again after. Since any possible interruptions
+	 * at this stage are guaranteed to be on the same CPU, we can
+	 * use cmpxchg_relaxed.
+	 */
+	count = atomic_read(&fast_pool->count);
+	do {
+		memcpy(pool, fast_pool->pool, sizeof(pool));
+	} while (atomic_try_cmpxchg_relaxed(&fast_pool->count, &count, 0));
+
+	fast_pool->last = jiffies;
+	migrate_enable();
+
+	mix_pool_bytes(pool, sizeof(pool));
+	credit_entropy_bits(1);
+	memzero_explicit(pool, sizeof(pool));
+}
+
 void add_interrupt_randomness(int irq)
 {
+	enum { MIX_INFLIGHT = 1U << 31 };
 	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs *regs = get_irq_regs();
 	unsigned long now = jiffies;
 	cycles_t cycles = random_get_entropy();
+	unsigned int new_count;
 
 	if (cycles == 0)
 		cycles = get_reg(fast_pool, regs);
@@ -1235,12 +1283,13 @@ void add_interrupt_randomness(int irq)
 	}
 
 	fast_mix((u32 *)fast_pool->pool);
-	++fast_pool->count;
+	/* The _acquire here pairs with the atomic_set_release in mix_interrupt_randomness(). */
+	new_count = (unsigned int)atomic_inc_return_acquire(&fast_pool->count);
 
 	if (unlikely(crng_init == 0)) {
-		if (fast_pool->count >= 64 &&
+		if (new_count >= 64 &&
 		    crng_fast_load(fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
-			fast_pool->count = 0;
+			atomic_set(&fast_pool->count, 0);
 			fast_pool->last = now;
 
 			/*
@@ -1254,20 +1303,16 @@ void add_interrupt_randomness(int irq)
 		return;
 	}
 
-	if ((fast_pool->count < 64) && !time_after(now, fast_pool->last + HZ))
+	if (new_count & MIX_INFLIGHT)
 		return;
 
-	if (!spin_trylock(&input_pool.lock))
+	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
 		return;
 
-	fast_pool->last = now;
-	_mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
-	spin_unlock(&input_pool.lock);
-
-	fast_pool->count = 0;
-
-	/* Award one bit for the contents of the fast pool. */
-	credit_entropy_bits(1);
+	if (unlikely(!fast_pool->mix.func))
+		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
+	atomic_or(MIX_INFLIGHT, &fast_pool->count);
+	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);