Message ID | 20220705112712.4433-2-dengler@linux.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Headers | show |
Series | s390/archrandom: use buffered random data | expand |
Hi Holger, On Tue, Jul 05, 2022 at 01:27:12PM +0200, Holger Dengler wrote: > The trng instruction is very expensive and has a constant runtime for > getting 0 to 32 bytes of (conditioned) true random data. Calling trng for > in arch_get_random_seed_long() for each 8 bytes is too time-consuming, > especially if it is called in interrupt context. > > This implementation buffers the trng data and deliver parts of it to the This patch seems to be repeating the same mistake I just cleaned up. Specifically, an advantage of a CPU RNG is that it can always provide *fresh* entropy, so that if, say, the system state is dumped, the CPU will continue to provide fresh new uncompromised values. When you buffer those values, they cease to be fresh. But more realistically, have you benchmarked this and seen that it's actually required? These functions are called once at boot, and then when the RNG is reseeded, which happens around once a minute. That's pretty darn rare. When you consider all the cycles that are completed over the course of a minute, whatever the cost of the TRNG is seems pretty negligible. So anyway, maybe it'd be best to look at the "big picture" problem you want to solve, rather than what looks to me like an attempt to solve a problem that doesn't exist. Or maybe it does? If so, I'd be interested to know when and how and where and such. Jason
Hey again, On Tue, Jul 5, 2022 at 3:18 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > On Tue, Jul 05, 2022 at 01:27:12PM +0200, Holger Dengler wrote: > > The trng instruction is very expensive and has a constant runtime for > > getting 0 to 32 bytes of (conditioned) true random data. Calling trng for Thinking about this a bit more, though, makes me think that maybe the API itself is suboptimal. First, we never use arch_get_random_int() anywhere in the kernel, so in theory it could be removed. Next, we could change the signature of it to be something like: static inline size_t arch_get_random_words_seed(unsigned long *buf, size_t num_words); static inline size_t arch_get_random_words(unsigned long *buf, size_t num_words); Since it's a static inline, when words==1, the code would be constant folded to the same thing it is now on x86. And when it's larger, we'd get more bang for our buck on s390. An unfortunate complication is that the uses in random.c currently follow the pattern of: if (!arch_get_random_seed_long(&v) || !arch_get_random_long(&v)) v = random_get_entropy(); And in that way things cascade down nicely per word, depending on current bus activity. The cascade would get a little bit uglier with what I suggested above. So we'd need to figure out something there to make that not hideous or result in awful codegen or something. This is, no doubt, a snag. Anyway, If you want to work on a tree-wide cleanup of this, I'd be happy to consider something like that in the random.git tree for 5.20. But, alternatively, maybe you think none of the above is really worth it just to get more per call from the TRNG, and so emphatic "meh" on what I've described. I can see this perspective well too. Jason
Hi Jason, On 05/07/2022 15:18, Jason A. Donenfeld wrote: > Hi Holger, > > On Tue, Jul 05, 2022 at 01:27:12PM +0200, Holger Dengler wrote: >> The trng instruction is very expensive and has a constant runtime for >> getting 0 to 32 bytes of (conditioned) true random data. Calling trng for >> in arch_get_random_seed_long() for each 8 bytes is too time-consuming, >> especially if it is called in interrupt context. >> >> This implementation buffers the trng data and deliver parts of it to the > > This patch seems to be repeating the same mistake I just cleaned up. > Specifically, an advantage of a CPU RNG is that it can always provide > *fresh* entropy, so that if, say, the system state is dumped, the CPU > will continue to provide fresh new uncompromised values. When you buffer > those values, they cease to be fresh. You're right, the buffering has the disadvantage, that the random data for the non-first callers are not fresh. But if we only want to have fresh data here, we should avoid this call in interrupt context completely (see below). > But more realistically, have you benchmarked this and seen that it's > actually required? These functions are called once at boot, and then > when the RNG is reseeded, which happens around once a minute. That's > pretty darn rare. When you consider all the cycles that are completed > over the course of a minute, whatever the cost of the TRNG is seems > pretty negligible. It is true, that the performance of the instruction is not really relevant, but only for calls outside of an interrupt context. I did some ftrace logging for the s390_random_get_seed_long() calls, and - as you said - there are a few calls per minute. But there was also some repeating calls in interrupt context. On systems with a huge interrupt load, this can cause severe performance impacts. I've no concrete performance measurements at the moment, but - as I said - the trng instruction on s390 takes a lot of time and in interrupt context the interrupts are disabled for the complete runtime of the instruction. > So anyway, maybe it'd be best to look at the "big picture" problem you > want to solve, rather than what looks to me like an attempt to solve a > problem that doesn't exist. Or maybe it does? If so, I'd be interested > to know when and how and where and such. The optimization of the trng calls was not the main goal, but we (Harald and I) thought about how can we provide trng data to in-interrupt callers as well, without doing the trng instruction call in the interrupt context itself. At the moment, I don't see any possibility to do both, fresh data and prevent trng calls in interrupt context BUT provide trng data for in-interrupt-callers. But I'm always open for new ideas. If the data must be fresh, I would propose not to use any trng-generated data for in-interrupt callers. > > Jason
Hi Holger, On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: > It is true, that the performance of the instruction is not really > relevant, but only for calls outside of an interrupt context. I did > some ftrace logging for the s390_random_get_seed_long() calls, and - > as you said - there are a few calls per minute. But there was also > some repeating calls in interrupt context. On systems with a huge > interrupt load, this can cause severe performance impacts. I've no It'd be interesting to know more about this. The way you get arch_random_get_seed_long() from irq context is: get_random_{bytes,int,long,u32,u64}() crng_make_state() crng_reseed() <-- Rarely extract_entropy() arch_get_random_seed_long() So if an irq user of get_random_xx() is the unlucky one in the minute span who has to call crng_reseed() then, yea, that'll happen. But I wonder about this luck aspect. What scenarios are you seeing where this happens all the time? Which driver is using random bytes *so* commonly from irq context? Not that, per say, there's anything wrong with that, but it could be eyebrow raising, and might point to de facto solutions that mostly take care of this. One such direction might be making a driver that does such a thing do it a little bit less, somehow. Another direction would be preferring non-irqs to handle crng_reseed(), but not disallowing irqs entirely, with a patch something like the one below. Or maybe there are other ideas. But all this is to say that having some more of the "mundane" details about this might actually help us. Jason diff --git a/drivers/char/random.c b/drivers/char/random.c index e3dd1dd3dd22..81df8cdf2a62 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) static bool early_boot = true; unsigned long interval = CRNG_RESEED_INTERVAL; + if (in_hardirq()) + interval += HZ * 10; + if (unlikely(READ_ONCE(early_boot))) { time64_t uptime = ktime_get_seconds(); if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2)
Hi Jason, On 05/07/2022 17:11, Jason A. Donenfeld wrote: > Hi Holger, > > On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: >> It is true, that the performance of the instruction is not really >> relevant, but only for calls outside of an interrupt context. I did >> some ftrace logging for the s390_random_get_seed_long() calls, and - >> as you said - there are a few calls per minute. But there was also >> some repeating calls in interrupt context. On systems with a huge >> interrupt load, this can cause severe performance impacts. I've no > > It'd be interesting to know more about this. The way you get > arch_random_get_seed_long() from irq context is: > > get_random_{bytes,int,long,u32,u64}() > crng_make_state() > crng_reseed() <-- Rarely > extract_entropy() > arch_get_random_seed_long() > > So if an irq user of get_random_xx() is the unlucky one in the minute > span who has to call crng_reseed() then, yea, that'll happen. But I > wonder about this luck aspect. What scenarios are you seeing where this > happens all the time? Which driver is using random bytes *so* commonly > from irq context? Not that, per say, there's anything wrong with that, > but it could be eyebrow raising, and might point to de facto solutions > that mostly take care of this. I saw a few calls in interrupt context during my tracing, but I didn't look to see which ones they were. Let me figure that out in the next few days and provide more information on that. > One such direction might be making a driver that does such a thing do it > a little bit less, somehow. Another direction would be preferring > non-irqs to handle crng_reseed(), but not disallowing irqs entirely, > with a patch something like the one below. Or maybe there are other > ideas. Reduce the number of trng in interrupt context is a possibility, but - in my opinion - only one single trng instruction call in interrupt context in one too much. For the moment, I would propose to drop the buffering but also return false, if arch_random_get_seed_long() is called in interrupt context. diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h index 2c6e1c6ecbe7..711357bdc464 100644 --- a/arch/s390/include/asm/archrandom.h +++ b/arch/s390/include/asm/archrandom.h @@ -32,7 +32,8 @@ static inline bool __must_check arch_get_random_int(unsigned int *v) static inline bool __must_check arch_get_random_seed_long(unsigned long *v) { - if (static_branch_likely(&s390_arch_random_available)) { + if (static_branch_likely(&s390_arch_random_available) && + !in_interrupt()) { cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); atomic64_add(sizeof(*v), &s390_arch_random_counter); return true; (on-top of your commit, without our buffering patch) > > But all this is to say that having some more of the "mundane" details > about this might actually help us. > > Jason > > diff --git a/drivers/char/random.c b/drivers/char/random.c > index e3dd1dd3dd22..81df8cdf2a62 100644 > --- a/drivers/char/random.c > +++ b/drivers/char/random.c > @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) > static bool early_boot = true; > unsigned long interval = CRNG_RESEED_INTERVAL; > > + if (in_hardirq()) > + interval += HZ * 10; > + > if (unlikely(READ_ONCE(early_boot))) { > time64_t uptime = ktime_get_seconds(); > if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2) >
Hi Holger, On Tue, Jul 05, 2022 at 06:27:59PM +0200, Holger Dengler wrote: > I saw a few calls in interrupt context during my tracing, but I didn't > look to see which ones they were. Let me figure that out in the next > few days and provide more information on that. One thing to keep in mind is that it's used at boot time, when technically IRQs are turned off, so it appears like interrupt context depending on which way you squint. But boot time obviously isn't a problem. So be sure that's not the usage you're seeing. > For the moment, I would propose to drop the buffering but also return > false, if arch_random_get_seed_long() is called in interrupt context. As a last ditch, maybe that's best. Maybe... Do you know off hand how many cycles each call takes? > diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h > index 2c6e1c6ecbe7..711357bdc464 100644 > --- a/arch/s390/include/asm/archrandom.h > +++ b/arch/s390/include/asm/archrandom.h > @@ -32,7 +32,8 @@ static inline bool __must_check arch_get_random_int(unsigned int *v) > > static inline bool __must_check arch_get_random_seed_long(unsigned long *v) > { > - if (static_branch_likely(&s390_arch_random_available)) { > + if (static_branch_likely(&s390_arch_random_available) && > + !in_interrupt()) { in_interrupt() is deprecated. You want in_hardirq() here. You'll also want to verify that this doesn't prevent random_init() from working. Jason
Hi Jason, On 05/07/2022 18:35, Jason A. Donenfeld wrote: > Hi Holger, > > On Tue, Jul 05, 2022 at 06:27:59PM +0200, Holger Dengler wrote: >> I saw a few calls in interrupt context during my tracing, but I didn't >> look to see which ones they were. Let me figure that out in the next >> few days and provide more information on that. > > One thing to keep in mind is that it's used at boot time, when > technically IRQs are turned off, so it appears like interrupt context > depending on which way you squint. But boot time obviously isn't a > problem. So be sure that's not the usage you're seeing. Ok, let me check this. I will also think about the tree-wide cleanup you mentioned in an earlier mail. It looks, that s390 could fill the block.rdseed with a single call. >> For the moment, I would propose to drop the buffering but also return >> false, if arch_random_get_seed_long() is called in interrupt context. > > As a last ditch, maybe that's best. Maybe... Do you know off hand how > many cycles each call takes? I don't know the exact number of cycles, but as I mentioned in the coverletter, the trng instruction is one of the specialties of the s390 platform. It looks like an instruction, but it is some kind of firmware executed (it is called millicode). These kind of long-running instructions are also interruptable and can resume. A trng call runs for minimal ~20-190us for 32 bytes. 20us on newer machine generations, 190us on older ones. These are not 100% exact measurements, but the dimension should be correct. > >> diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h >> index 2c6e1c6ecbe7..711357bdc464 100644 >> --- a/arch/s390/include/asm/archrandom.h >> +++ b/arch/s390/include/asm/archrandom.h >> @@ -32,7 +32,8 @@ static inline bool __must_check arch_get_random_int(unsigned int *v) >> >> static inline bool __must_check arch_get_random_seed_long(unsigned long *v) >> { >> - if (static_branch_likely(&s390_arch_random_available)) { >> + if (static_branch_likely(&s390_arch_random_available) && >> + !in_interrupt()) { > > in_interrupt() is deprecated. You want in_hardirq() here. You'll also > want to verify that this doesn't prevent random_init() from working. > > Jason
Hey Holger, On Tue, Jul 05, 2022 at 07:47:37PM +0200, Holger Dengler wrote: > A trng call runs for minimal ~20-190us for 32 bytes. 20us on newer > machine generations, 190us on older ones. These are not 100% exact > measurements, but the dimension should be correct. Holy smokes. Yea, okay, I see what you're saying. So indeed it sounds like the `!in_hardirq()` addition would be a good idea. Let's do that. Also, I noticed that the TRNG has a hwrng driver. That means the RNG will still be getting continuous input from it in a kthread, not an interrupt handler, so from a crypto PoV, we're not really losing /that/ much by adding the `!in_hardirq()` clause. So all and all, that seems like the simplest solution without too big of a downside. Jason
Hi Jason, On 05/07/2022 20:19, Jason A. Donenfeld wrote: > Hey Holger, > > On Tue, Jul 05, 2022 at 07:47:37PM +0200, Holger Dengler wrote: >> A trng call runs for minimal ~20-190us for 32 bytes. 20us on newer >> machine generations, 190us on older ones. These are not 100% exact >> measurements, but the dimension should be correct. > > Holy smokes. Yea, okay, I see what you're saying. So indeed it sounds > like the `!in_hardirq()` addition would be a good idea. Let's do that. :) I'll come up with this in v2. For the long run, a re-worked API arch_get_random_seed_something() with an arch-dependant variable block length is worth to think about. It seems, that x86 and ppc delivers a long per trng instruction call, while on s390 it would make more sense to fill the block.rdseed in a single call.
On 2022-07-05 18:27, Holger Dengler wrote: > Hi Jason, > > On 05/07/2022 17:11, Jason A. Donenfeld wrote: >> Hi Holger, >> >> On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: >>> It is true, that the performance of the instruction is not really >>> relevant, but only for calls outside of an interrupt context. I did >>> some ftrace logging for the s390_random_get_seed_long() calls, and - >>> as you said - there are a few calls per minute. But there was also >>> some repeating calls in interrupt context. On systems with a huge >>> interrupt load, this can cause severe performance impacts. I've no >> >> It'd be interesting to know more about this. The way you get >> arch_random_get_seed_long() from irq context is: >> >> get_random_{bytes,int,long,u32,u64}() >> crng_make_state() >> crng_reseed() <-- Rarely >> extract_entropy() >> arch_get_random_seed_long() >> >> So if an irq user of get_random_xx() is the unlucky one in the minute >> span who has to call crng_reseed() then, yea, that'll happen. But I >> wonder about this luck aspect. What scenarios are you seeing where >> this >> happens all the time? Which driver is using random bytes *so* commonly >> from irq context? Not that, per say, there's anything wrong with that, >> but it could be eyebrow raising, and might point to de facto solutions >> that mostly take care of this. > > I saw a few calls in interrupt context during my tracing, but I didn't > look to see which ones they were. Let me figure that out in the next > few days and provide more information on that. > >> One such direction might be making a driver that does such a thing do >> it >> a little bit less, somehow. Another direction would be preferring >> non-irqs to handle crng_reseed(), but not disallowing irqs entirely, >> with a patch something like the one below. Or maybe there are other >> ideas. > > Reduce the number of trng in interrupt context is a possibility, but - > in my opinion - only one single trng instruction call in interrupt > context in one too much. > > For the moment, I would propose to drop the buffering but also return > false, if arch_random_get_seed_long() is called in interrupt context. > > diff --git a/arch/s390/include/asm/archrandom.h > b/arch/s390/include/asm/archrandom.h > index 2c6e1c6ecbe7..711357bdc464 100644 > --- a/arch/s390/include/asm/archrandom.h > +++ b/arch/s390/include/asm/archrandom.h > @@ -32,7 +32,8 @@ static inline bool __must_check > arch_get_random_int(unsigned int *v) > > static inline bool __must_check arch_get_random_seed_long(unsigned > long *v) > { > - if (static_branch_likely(&s390_arch_random_available)) { > + if (static_branch_likely(&s390_arch_random_available) && > + !in_interrupt()) { > cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); > atomic64_add(sizeof(*v), &s390_arch_random_counter); > return true; > > (on-top of your commit, without our buffering patch) > >> >> But all this is to say that having some more of the "mundane" details >> about this might actually help us. >> >> Jason >> >> diff --git a/drivers/char/random.c b/drivers/char/random.c >> index e3dd1dd3dd22..81df8cdf2a62 100644 >> --- a/drivers/char/random.c >> +++ b/drivers/char/random.c >> @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) >> static bool early_boot = true; >> unsigned long interval = CRNG_RESEED_INTERVAL; >> >> + if (in_hardirq()) >> + interval += HZ * 10; >> + >> if (unlikely(READ_ONCE(early_boot))) { >> time64_t uptime = ktime_get_seconds(); >> if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2) >> Hi Holger and Jason I tried to find out what is the reason of the invocations in interrupt context. First I have to admit that there is in fact not much of arch_get_random_seed_long() invocation any more in the recent kernel (5.19-rc5). I see about 100 invocations within 10 minutes with an LPAR running some qperf and dd dumps on dasds test load. About half of these invocations is in interrupt context. I dump_stack()ed some of these and I always catch the function kfence_guarded_alloc() prandom_u32_max() prandom_u32() get_random_u32() _get_random_bytes() crng_make_state() crng_reseed() extract_entropy() arch_get_random_seed_long() However, with so few invocations it should not make any harm when there is a even very expensive trng() invocation in interrupt context. But I think we should check, if this is really something to backport to the older kernels where arch_get_random_seed_long() is called really frequency. Harald Freudenberger
Hi Harald, On Wed, Jul 06, 2022 at 06:18:27PM +0200, Harald Freudenberger wrote: > On 2022-07-05 18:27, Holger Dengler wrote: > > Hi Jason, > > > > On 05/07/2022 17:11, Jason A. Donenfeld wrote: > >> Hi Holger, > >> > >> On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: > >>> It is true, that the performance of the instruction is not really > >>> relevant, but only for calls outside of an interrupt context. I did > >>> some ftrace logging for the s390_random_get_seed_long() calls, and - > >>> as you said - there are a few calls per minute. But there was also > >>> some repeating calls in interrupt context. On systems with a huge > >>> interrupt load, this can cause severe performance impacts. I've no > >> > >> It'd be interesting to know more about this. The way you get > >> arch_random_get_seed_long() from irq context is: > >> > >> get_random_{bytes,int,long,u32,u64}() > >> crng_make_state() > >> crng_reseed() <-- Rarely > >> extract_entropy() > >> arch_get_random_seed_long() > >> > >> So if an irq user of get_random_xx() is the unlucky one in the minute > >> span who has to call crng_reseed() then, yea, that'll happen. But I > >> wonder about this luck aspect. What scenarios are you seeing where > >> this > >> happens all the time? Which driver is using random bytes *so* commonly > >> from irq context? Not that, per say, there's anything wrong with that, > >> but it could be eyebrow raising, and might point to de facto solutions > >> that mostly take care of this. > > > > I saw a few calls in interrupt context during my tracing, but I didn't > > look to see which ones they were. Let me figure that out in the next > > few days and provide more information on that. > > > >> One such direction might be making a driver that does such a thing do > >> it > >> a little bit less, somehow. Another direction would be preferring > >> non-irqs to handle crng_reseed(), but not disallowing irqs entirely, > >> with a patch something like the one below. Or maybe there are other > >> ideas. > > > > Reduce the number of trng in interrupt context is a possibility, but - > > in my opinion - only one single trng instruction call in interrupt > > context in one too much. > > > > For the moment, I would propose to drop the buffering but also return > > false, if arch_random_get_seed_long() is called in interrupt context. > > > > diff --git a/arch/s390/include/asm/archrandom.h > > b/arch/s390/include/asm/archrandom.h > > index 2c6e1c6ecbe7..711357bdc464 100644 > > --- a/arch/s390/include/asm/archrandom.h > > +++ b/arch/s390/include/asm/archrandom.h > > @@ -32,7 +32,8 @@ static inline bool __must_check > > arch_get_random_int(unsigned int *v) > > > > static inline bool __must_check arch_get_random_seed_long(unsigned > > long *v) > > { > > - if (static_branch_likely(&s390_arch_random_available)) { > > + if (static_branch_likely(&s390_arch_random_available) && > > + !in_interrupt()) { > > cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); > > atomic64_add(sizeof(*v), &s390_arch_random_counter); > > return true; > > > > (on-top of your commit, without our buffering patch) > > > >> > >> But all this is to say that having some more of the "mundane" details > >> about this might actually help us. > >> > >> Jason > >> > >> diff --git a/drivers/char/random.c b/drivers/char/random.c > >> index e3dd1dd3dd22..81df8cdf2a62 100644 > >> --- a/drivers/char/random.c > >> +++ b/drivers/char/random.c > >> @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) > >> static bool early_boot = true; > >> unsigned long interval = CRNG_RESEED_INTERVAL; > >> > >> + if (in_hardirq()) > >> + interval += HZ * 10; > >> + > >> if (unlikely(READ_ONCE(early_boot))) { > >> time64_t uptime = ktime_get_seconds(); > >> if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2) > >> > > Hi Holger and Jason > I tried to find out what is the reason of the invocations in interrupt > context. > First I have to admit that there is in fact not much of > arch_get_random_seed_long() > invocation any more in the recent kernel (5.19-rc5). I see about 100 > invocations > within 10 minutes with an LPAR running some qperf and dd dumps on dasds > test load. > About half of these invocations is in interrupt context. I > dump_stack()ed some of > these and I always catch the function > kfence_guarded_alloc() > prandom_u32_max() > prandom_u32() > get_random_u32() > _get_random_bytes() > crng_make_state() > crng_reseed() > extract_entropy() > arch_get_random_seed_long() > > However, with so few invocations it should not make any harm when there > is a > even very expensive trng() invocation in interrupt context. > > But I think we should check, if this is really something to backport to > the older > kernels where arch_get_random_seed_long() is called really frequency. I backported the current random.c design to old kernels, so the situation there should be the same as in 5.19-rc5. So if you feel such rare usage is find even in_hardirq(), then I suppose there's nothing more to do here? Jason
Am 06.07.22 um 18:26 schrieb Jason A. Donenfeld: > Hi Harald, > > On Wed, Jul 06, 2022 at 06:18:27PM +0200, Harald Freudenberger wrote: >> On 2022-07-05 18:27, Holger Dengler wrote: >>> Hi Jason, >>> >>> On 05/07/2022 17:11, Jason A. Donenfeld wrote: >>>> Hi Holger, >>>> >>>> On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: >>>>> It is true, that the performance of the instruction is not really >>>>> relevant, but only for calls outside of an interrupt context. I did >>>>> some ftrace logging for the s390_random_get_seed_long() calls, and - >>>>> as you said - there are a few calls per minute. But there was also >>>>> some repeating calls in interrupt context. On systems with a huge >>>>> interrupt load, this can cause severe performance impacts. I've no >>>> >>>> It'd be interesting to know more about this. The way you get >>>> arch_random_get_seed_long() from irq context is: >>>> >>>> get_random_{bytes,int,long,u32,u64}() >>>> crng_make_state() >>>> crng_reseed() <-- Rarely >>>> extract_entropy() >>>> arch_get_random_seed_long() >>>> >>>> So if an irq user of get_random_xx() is the unlucky one in the minute >>>> span who has to call crng_reseed() then, yea, that'll happen. But I >>>> wonder about this luck aspect. What scenarios are you seeing where >>>> this >>>> happens all the time? Which driver is using random bytes *so* commonly >>>> from irq context? Not that, per say, there's anything wrong with that, >>>> but it could be eyebrow raising, and might point to de facto solutions >>>> that mostly take care of this. >>> >>> I saw a few calls in interrupt context during my tracing, but I didn't >>> look to see which ones they were. Let me figure that out in the next >>> few days and provide more information on that. >>> >>>> One such direction might be making a driver that does such a thing do >>>> it >>>> a little bit less, somehow. Another direction would be preferring >>>> non-irqs to handle crng_reseed(), but not disallowing irqs entirely, >>>> with a patch something like the one below. Or maybe there are other >>>> ideas. >>> >>> Reduce the number of trng in interrupt context is a possibility, but - >>> in my opinion - only one single trng instruction call in interrupt >>> context in one too much. >>> >>> For the moment, I would propose to drop the buffering but also return >>> false, if arch_random_get_seed_long() is called in interrupt context. >>> >>> diff --git a/arch/s390/include/asm/archrandom.h >>> b/arch/s390/include/asm/archrandom.h >>> index 2c6e1c6ecbe7..711357bdc464 100644 >>> --- a/arch/s390/include/asm/archrandom.h >>> +++ b/arch/s390/include/asm/archrandom.h >>> @@ -32,7 +32,8 @@ static inline bool __must_check >>> arch_get_random_int(unsigned int *v) >>> >>> static inline bool __must_check arch_get_random_seed_long(unsigned >>> long *v) >>> { >>> - if (static_branch_likely(&s390_arch_random_available)) { >>> + if (static_branch_likely(&s390_arch_random_available) && >>> + !in_interrupt()) { >>> cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); >>> atomic64_add(sizeof(*v), &s390_arch_random_counter); >>> return true; >>> >>> (on-top of your commit, without our buffering patch) >>> >>>> >>>> But all this is to say that having some more of the "mundane" details >>>> about this might actually help us. >>>> >>>> Jason >>>> >>>> diff --git a/drivers/char/random.c b/drivers/char/random.c >>>> index e3dd1dd3dd22..81df8cdf2a62 100644 >>>> --- a/drivers/char/random.c >>>> +++ b/drivers/char/random.c >>>> @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) >>>> static bool early_boot = true; >>>> unsigned long interval = CRNG_RESEED_INTERVAL; >>>> >>>> + if (in_hardirq()) >>>> + interval += HZ * 10; >>>> + >>>> if (unlikely(READ_ONCE(early_boot))) { >>>> time64_t uptime = ktime_get_seconds(); >>>> if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2) >>>> >> >> Hi Holger and Jason >> I tried to find out what is the reason of the invocations in interrupt >> context. >> First I have to admit that there is in fact not much of >> arch_get_random_seed_long() >> invocation any more in the recent kernel (5.19-rc5). I see about 100 >> invocations >> within 10 minutes with an LPAR running some qperf and dd dumps on dasds >> test load. >> About half of these invocations is in interrupt context. I >> dump_stack()ed some of >> these and I always catch the function >> kfence_guarded_alloc() >> prandom_u32_max() >> prandom_u32() >> get_random_u32() >> _get_random_bytes() >> crng_make_state() >> crng_reseed() >> extract_entropy() >> arch_get_random_seed_long() >> >> However, with so few invocations it should not make any harm when there >> is a >> even very expensive trng() invocation in interrupt context. >> >> But I think we should check, if this is really something to backport to >> the older >> kernels where arch_get_random_seed_long() is called really frequency. > > I backported the current random.c design to old kernels, so the > situation there should be the same as in 5.19-rc5. > > So if you feel such rare usage is find even in_hardirq(), then I suppose > there's nothing more to do here? I think up to 190µs in interrupt can result in unwanted latencies. Yes it does not happen that often and it is smaller than most timeslices of hypervisors. So I would likely turn that question around what happens if we return false if in_hardirq is true? Any noticeable delays in code that uses random numbers?
Hi Christian, On Wed, Jul 06, 2022 at 08:29:49PM +0200, Christian Borntraeger wrote: > >> However, with so few invocations it should not make any harm when there > >> is a > >> even very expensive trng() invocation in interrupt context. > >> > >> But I think we should check, if this is really something to backport to > >> the older > >> kernels where arch_get_random_seed_long() is called really frequency. > > > > I backported the current random.c design to old kernels, so the > > situation there should be the same as in 5.19-rc5. > > > > So if you feel such rare usage is find even in_hardirq(), then I suppose > > there's nothing more to do here? > > I think up to 190µs in interrupt can result in unwanted latencies. Yes it does not > happen that often and it is smaller than most timeslices of hypervisors. > So I would likely turn that question around > what happens if we return false if in_hardirq is true? Any noticeable > delays in code that uses random numbers? I think I already answered this here with mention of the hwrng driver: https://lore.kernel.org/all/YsSAn2qXqlFkS5sH@zx2c4.com/ Anyway, I would recommend keeping it available in all contexts, so that s390 isn't a special case to keep in mind, and because Harald said he couldn't measure an actual problem existing for real. Plus, it's not as though we're talking about RT kernels or a problem that would affect RT. But if you're convinced that even the extremely rare case poses a issue, doing the !in_hardirq() thing won't be the end of the world either and is partly mitigated by the hwrng driver mentioned earlier. So I think it's mostly up to you guys on what the tradeoffs are and what's realistic and such. Jason
diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c index 1f2d40993c4d..07abd09ec759 100644 --- a/arch/s390/crypto/arch_random.c +++ b/arch/s390/crypto/arch_random.c @@ -2,17 +2,66 @@ /* * s390 arch random implementation. * - * Copyright IBM Corp. 2017, 2020 + * Copyright IBM Corp. 2017, 2022 * Author(s): Harald Freudenberger + * Holger Dengler <dengler@linux.ibm.com> + * + * The trng instruction on s390 is very expensive and has a constant runtime + * for up to 32 bytes. Each trng call will get 32 bytes of (conditioned) true + * random data, which are buffered in a lock-protected array and delivered to + * up to four callers. To avoid long running trng calls in the interrupt + * context, a refill is skipped there. Also prevent the blocking of callers in + * interrupt context by a refill. */ #include <linux/kernel.h> #include <linux/atomic.h> #include <linux/random.h> +#include <linux/spinlock.h> #include <linux/static_key.h> #include <asm/cpacf.h> DEFINE_STATIC_KEY_FALSE(s390_arch_random_available); +static struct { + unsigned long data[BITS_TO_LONGS(32 * BITS_PER_BYTE)]; + int idx; +} trngbuf; +static DEFINE_SPINLOCK(trngbuf_lock); + atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0); EXPORT_SYMBOL(s390_arch_random_counter); + +bool s390_get_random_seed_long(unsigned long *v) +{ + unsigned long flags; + + if (in_interrupt()) { + if (!spin_trylock_irqsave(&trngbuf_lock, flags)) + return false; + } else { + spin_lock_irqsave(&trngbuf_lock, flags); + } + /* trngbuf is locked */ + + if (--trngbuf.idx < 0) { + /* buffer empty */ + if (in_interrupt()) { + /* delegate refill to another caller */ + spin_unlock_irqrestore(&trngbuf_lock, flags); + return false; + } + + /* refill buffer */ + cpacf_trng(NULL, 0, (u8 *)trngbuf.data, sizeof(trngbuf.data)); + trngbuf.idx = ARRAY_SIZE(trngbuf.data) - 1; + } + + /* deliver buffered random data */ + *v = trngbuf.data[trngbuf.idx]; + spin_unlock_irqrestore(&trngbuf_lock, flags); + + atomic64_add(sizeof(long), &s390_arch_random_counter); + return true; +} +EXPORT_SYMBOL(s390_get_random_seed_long); diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h index 2c6e1c6ecbe7..a1e365de6b33 100644 --- a/arch/s390/include/asm/archrandom.h +++ b/arch/s390/include/asm/archrandom.h @@ -2,7 +2,7 @@ /* * Kernel interface for the s390 arch_random_* functions * - * Copyright IBM Corp. 2017, 2020 + * Copyright IBM Corp. 2017, 2022 * * Author: Harald Freudenberger <freude@de.ibm.com> * @@ -20,6 +20,8 @@ DECLARE_STATIC_KEY_FALSE(s390_arch_random_available); extern atomic64_t s390_arch_random_counter; +bool s390_get_random_seed_long(unsigned long *v); + static inline bool __must_check arch_get_random_long(unsigned long *v) { return false; @@ -32,21 +34,14 @@ static inline bool __must_check arch_get_random_int(unsigned int *v) static inline bool __must_check arch_get_random_seed_long(unsigned long *v) { - if (static_branch_likely(&s390_arch_random_available)) { - cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); - atomic64_add(sizeof(*v), &s390_arch_random_counter); - return true; - } + if (static_branch_likely(&s390_arch_random_available)) + return s390_get_random_seed_long(v); + return false; } static inline bool __must_check arch_get_random_seed_int(unsigned int *v) { - if (static_branch_likely(&s390_arch_random_available)) { - cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); - atomic64_add(sizeof(*v), &s390_arch_random_counter); - return true; - } return false; }