Message ID | 20170606005108.5646-5-Jason@zx2c4.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jun 06, 2017 at 02:50:59AM +0200, Jason A. Donenfeld wrote: > Otherwise, we might be seeding the RNG using bad randomness, which is > dangerous. > > Cc: Herbert Xu <herbert@gondor.apana.org.au> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> > --- > crypto/rng.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/crypto/rng.c b/crypto/rng.c > index f46dac5288b9..e042437e64b4 100644 > --- a/crypto/rng.c > +++ b/crypto/rng.c > @@ -48,12 +48,14 @@ int crypto_rng_reset(struct crypto_rng *tfm, const u8 *seed, unsigned int slen) > if (!buf) > return -ENOMEM; > > - get_random_bytes(buf, slen); > + err = get_random_bytes_wait(buf, slen); Note that crypto_rng_reset() is called by big_key_init() in security/keys/big_key.c as a late_initcall(). So if we are on a system where the crng doesn't get initialized until during the system boot scripts, and big_key is compiled directly into the kernel, the boot could end up deadlocking. There may be other instances of where crypto_rng_reset() is called by an initcall, so big_key_init() may not be an exhaustive enumeration of potential problems. But this is an example of why the synchronous API, although definitely much more convenient, can end up being a trap for the unwary.... - Ted
Hey Ted, On Tue, Jun 6, 2017 at 5:00 AM, Theodore Ts'o <tytso@mit.edu> wrote: > Note that crypto_rng_reset() is called by big_key_init() in > security/keys/big_key.c as a late_initcall(). So if we are on a > system where the crng doesn't get initialized until during the system > boot scripts, and big_key is compiled directly into the kernel, the > boot could end up deadlocking. > > There may be other instances of where crypto_rng_reset() is called by > an initcall, so big_key_init() may not be an exhaustive enumeration of > potential problems. But this is an example of why the synchronous > API, although definitely much more convenient, can end up being a trap > for the unwary.... Thanks for pointing this out. I'll look more closely into it and see if I can figure out a good way of approaching this. Indeed you're right -- that we have to be really quite careful every time we use the synchronous API. For this reason, I separated things out into the wait_for_random_bytes and then the wrapper around wait_for_random_bytes+get_random_bytes of get_random_bytes_wait. The idea here would be that drivers could place a single wait_for_random_bytes at some userspace entry point -- a configuration ioctl, for example -- and then try to ensure that all calls to get_random_bytes are ordered _after_ that wait_for_random_bytes call. While this pattern doesn't fix all cases of unseeded get_random_bytes calls -- we'll need to do some module loading order cleverness for that, as we discussed in the other thread -- I think this pattern will fix an acceptable amount of call sites, as seen here in this patchset, that it makes it worthwhile. Having it, too, I think would encourage other new drivers to think about when their calls to get_random_bytes happens, and if it's possible for them to defer it until after a userspace-blocking call to wait_for_random_bytes. Anyway, I'll look into and fix up the problem you mentioned. Looking forward to your feedback on the other patches here. Regards, Jason
On Tue, Jun 06, 2017 at 05:56:20AM +0200, Jason A. Donenfeld wrote: > Hey Ted, > > On Tue, Jun 6, 2017 at 5:00 AM, Theodore Ts'o <tytso@mit.edu> wrote: > > Note that crypto_rng_reset() is called by big_key_init() in > > security/keys/big_key.c as a late_initcall(). So if we are on a > > system where the crng doesn't get initialized until during the system > > boot scripts, and big_key is compiled directly into the kernel, the > > boot could end up deadlocking. > > > > There may be other instances of where crypto_rng_reset() is called by > > an initcall, so big_key_init() may not be an exhaustive enumeration of > > potential problems. But this is an example of why the synchronous > > API, although definitely much more convenient, can end up being a trap > > for the unwary.... > > Thanks for pointing this out. I'll look more closely into it and see > if I can figure out a good way of approaching this. I don't think big_key even needs randomness at init time. The 'big_key_rng' could just be removed and big_key_gen_enckey() changed to call get_random_bytes(). (Or get_random_bytes_wait(), I guess; it's only reachable via the keyring syscalls.) It's going to take a while to go through all 217 users of get_random_bytes() like this, though... It's really a shame there's no way to guarantee good randomness at boot time. Eric
Hi Eric, On Tue, Jun 6, 2017 at 6:44 AM, Eric Biggers <ebiggers3@gmail.com> wrote: > I don't think big_key even needs randomness at init time. The 'big_key_rng' > could just be removed and big_key_gen_enckey() changed to call > get_random_bytes(). (Or get_random_bytes_wait(), I guess; it's only reachable > via the keyring syscalls.) That sounds good to me. I'll go ahead and make these changes, and will add you to the Cc for the patch. You'll find the latest version in here: https://git.zx2c4.com/linux-dev/log/?h=jd/rng-blocker > It's going to take a while to go through all 217 users of get_random_bytes() > like this, though... It's really a shame there's no way to guarantee good > randomness at boot time. Yes, I agree whole-heartedly. A lot of people have proposals for fixing the direct idea of entropy gathering, but for whatever reason, Ted hasn't merged stuff. I think Stephan (CCd) rewrote big critical sections of the RNG, called LRNG, and published a big paper for peer review and did a lot of cool engineering, but for some reason this hasn't been integrated. I look forward to movement on this front in the future, if it ever happens. Would be great. However, in lieu of that, I agree that playing whack a mole with all call sites is mega arduous and error prone. In my original message to Ted about this, I proposed instead a more global approach of introducing an rng_init() to complement things like late_init() and device_init() and such. The idea here would be two-fold: - Modules that are built in would only be loaded as a callback to the initialization of the RNG. An API for that already exists. - Modules that are external would simply block userspace in request_module until the RNG is initialized. This patch series adds that kind of API. If I understood correctly, Ted was worried that this might introduce some headaches with module load ordering. However, IMHO, dealing with the very few use cases of ordering issues is going to be far less arduous than playing whack-a-mole with every call site. But, given the fact that we still do need a blocking API (this patch series), I decided to go with implementing this first, and then second attacking the more difficult issue of adding rng_init(). So hopefully a combination of this patch series and the next one will amount to something workable. Long term, though, I think we need Stephan's work, in one form or another, to be merged. Jason
Hey again Eric, One thing led to another and I wound up just rewriting all the crypto in big_keys.c. I'll include this for v4: https://git.zx2c4.com/linux-dev/commit/?h=jd/rng-blocker&id=886ff283b9808aecb14aa8e397da8496a9635aed Not only was the use of crypto/rng inappropriate, but the decision to go with aes-ecb is shocking. Seeing that this author had no other commits in the tree, and that all subsequent commits that mentioned his name were cleaning up his mess, I just went ahead and removed both the crypto/rng misusage and changed from aes-ecb to aes-gcm. Anyway, I'll wait for some more reviews on v3, and then this can be reviewed for v4. Regards, Jason
On Tue, Jun 06, 2017 at 02:34:43PM +0200, Jason A. Donenfeld wrote: > > Yes, I agree whole-heartedly. A lot of people have proposals for > fixing the direct idea of entropy gathering, but for whatever reason, > Ted hasn't merged stuff. I think Stephan (CCd) rewrote big critical > sections of the RNG, called LRNG, and published a big paper for peer > review and did a lot of cool engineering, but for some reason this > hasn't been integrated. I look forward to movement on this front in > the future, if it ever happens. Would be great. So it's not clear what you mean by Stephan's work. It can be separated into multiple pieces; one is simply using a mechanism which can be directly mapped to NIST's DRBG framework. I don't believe this actually adds any real security per se, but it can make it easier to get certification for people who care about getting FIPS certification. Since I've seen a lot of snake oil and massive waste of taxpayer and industry dollars by FIPS certification firms, it's not a thing I particularly find particularly compelling. The second bit is "Jitter Entropy". The problem I have with that is there isn't any convincing explanation about why it can't be predicted to some degree of accuracy with someone who understands what's going on with Intel's cache architecture. (And this isn't just me, I've talked to people who work at Intel and they are at best skeptical of the whole idea.) To be honest, there is a certain amount of this which is true with harvesting interrupt timestamps, since for at least some interrupts (in the worst case, the timer interrupt, especially on SOC's where all of the clocks are generated from a single master oscillator) at least some of the unpredictability is due to fact that the attacker needs to understand what's going on with cache hits and misses, and that in turn is impacted by compiler code generation, yadda, yadda, yadda. The main thing then with trying to get entropy from sampling from the environment is to have a mixing function that you trust, and that you capture enough environmental data which hopefully is not available to the attacker. So for example, radio strength measurements from the WiFi data is not necessarily secret, but hopefully the information of whether the cell phone is on your desk, or in your knapsack, either on the desk, or under the desk, etc., is not available the analyst sitting in Fort Meade (or Beijing, if you trust the NSA but not the Ministry of State Security :-). The judgement call is when you've gathered enough environmental data (whether it is from CPU timing and cache misses if you are using Jitter Entropy), or interupt timing, etc., is when you have enough unpredictable data that it will be sufficient to protect you against the attacker. We try to make some guesses of when we've gathered a "bit" of entropy, but it's important to be humble here. We don't have a theoretical framework for *any* of this, so the way we gather metrics is really not all that scientific. We also need to be careful not to fall into the trap of wishful thinking. Yes, if we can say that the CRNG is fully initialized before the init scripts are started, or even very early in the initcall, then we can say yay! Problem solved!! But just because someone *claims* that JitterEntropy will solve the problem, doesn't necessarily mean it really does. I'm not accusing Stephan of trying to deliberately sell snake oil; just that at least some poeople have looked at it dubiously, and I would at least prefer to gather a lot more environmental noise, and be more conservative before saying that we're sure the CRNG is fully initialized. The other approach is to find a way to have initialized "seed" entropy which we can count on at every boot. The problem is that this is very much dependent on how the bootloader works. It's easy to say "store it in the kernel", but where the kernel is stored varies greatly from architecture to architecture. In some cases, the kernel can stored in ROM, where it can't be modified at all. It might be possible, for example, to store a cryptographic key in a UEFI boot-services variable, where the key becomes inaccessible after the boot-time services terminate. But you also need either a reliable time-of-day clock, or a reliable counter which is incremented each time the system that boots, and which can't be messed with by an attacker, or trivially reset by a clueless user/sysadmin. Or maybe we can have a script that is run at shutdown and boot-up that stashes 32 bytes of entropy in a reserved space accessible to GRUB, and which GRUB then passes to the kernel using an extension to the Linux/x86 Boot Protocol. (See Documentation/x86/boot.txt) Quite frankly, I think this is actually a more useful and fruitful path than either the whack-a-mole audit of all of the calls to get_random_bytes() or adding a blocking variant to get_random_bytes() (since in my opinion this becomes yet another version of whack-a-mole, since each change to use the blocking variant requires an audit of how the randomness is used, or where the function is called). The reality though is that Linux is a volunteer effort, and so all a maintainer can control is (a) is personal time, (b) whatever resources his company may have entrusted him with, (c) trying to pursuade others in the development community to do things (for which this e-mail is an example :-), and ultimately, (d) the maintainer can say NO to a patch. I try as much as possible to do (c), but the reality is that /dev/random is sexiest thing, and to be honest, I suspect that there are many more sources of vulnerability which are easier for an attacker than attacking the random number generator. So it may in fact be _rational_ for people who are working on hardening the kernel to focus on other areas. That being said, we should be trying to improve things on all fronts, not just the sexy ones. > Ted about this, I proposed instead a more global approach of > introducing an rng_init() to complement things like late_init() and > device_init() and such. The idea here would be two-fold: > > - Modules that are built in would only be loaded as a callback to the > initialization of the RNG. An API for that already exists. > - Modules that are external would simply block userspace in > request_module until the RNG is initialized. This patch series adds > that kind of API. > > If I understood correctly, Ted was worried that this might introduce > some headaches with module load ordering. My concern is while it might work on one architecture, it would break on another architecture. And even on one architecture, it might be that it works on bare metal hardware, but on in a virtual environment, there aren't enough interrupts for us to fully initialize the CRNG. So it might be that Fedora with its kernel config file work fine in one area, but it mysteriously fails if you install Fedora in a VM --- and worse, maybe it works in Cloud Platform A, but not Cloud Platform B. (And then the rumor mongers will come out and claim that the failure on one Cloud Platform is due to the fact that some set of enigneers work for one company versus another... not that we've seen any kind of rants like that on the kernel-hardening mailing list! :-) I think this is a soluble problem, but it may be rather tricky. For example, it may be that for a certain class of init calls, even though they are in subsystems that are compiled into the kernel, those init calls perhaps could be deferred so they are running in parallel with the init scripts. (Or maybe we could just require that certain kernel modules can *only* be compiled as modules if they use rng_init --- although that may get annoying for those of us who like being able to build custom configured monolithic kernels. So I'd prefer the first possibility if at all possible.) Cheers, - Ted
Hi Jason, On Tue, Jun 06, 2017 at 05:23:04PM +0200, Jason A. Donenfeld wrote: > Hey again Eric, > > One thing led to another and I wound up just rewriting all the crypto > in big_keys.c. I'll include this for v4: > > https://git.zx2c4.com/linux-dev/commit/?h=jd/rng-blocker&id=886ff283b9808aecb14aa8e397da8496a9635aed > > Not only was the use of crypto/rng inappropriate, but the decision to > go with aes-ecb is shocking. Seeing that this author had no other > commits in the tree, and that all subsequent commits that mentioned > his name were cleaning up his mess, I just went ahead and removed both > the crypto/rng misusage and changed from aes-ecb to aes-gcm. > > Anyway, I'll wait for some more reviews on v3, and then this can be > reviewed for v4. > > Regards, > Jason I agree that the use of ECB mode in big_key is broken, and thanks for trying to fix it! I think using GCM is good, but please leave a very conspicuous comment where the nonce is being set to 0, noting that it's safe only because a unique key is used to encrypt every big_key *and* the big_keys are not updatable (via an .update method in the key_type), resulting in each GCM key being used for only a single encryption. Also, I think you should send this to the keyrings mailing list and maintainer so it can be discussed and merged separately from your RNG changes. Eric
On Tue, Jun 6, 2017 at 7:03 PM, Theodore Ts'o <tytso@mit.edu> wrote: > So it's not clear what you mean by Stephan's work. I just meant that there's a guy out there who seems really motivated to work on this stuff in detail, but hasn't seen too much love, AFAIK. I'm sure there's an interesting technical discussion to be had about his contributions. > The reality though is that Linux is a volunteer effort Yep! And here I am volunteering, writing some code, in my free time, just for willies. I hope you'll be a kind, helpful, and supportive maintainer who welcomes contributions and discussion. > So it may in > fact be _rational_ for people who are working on hardening the kernel > to focus on other areas. > improve things on all fronts, not just the sexy ones. I don't want people to use some aspects of a module I'm writing before get_random_bytes() will return good randomness. You made some point about what's sexy and what isn't and what is rational for people to work on, but I think I missed the thrust of it. I'm working on this particular problem now with get_random_bytes, as something motivated by simple considerations, not complex ulterior motives. But anyway, moving on... > I think this is a soluble problem, but it may be rather tricky. For > example, it may be that for a certain class of init calls, even though > they are in subsystems that are compiled into the kernel, those init > calls perhaps could be deferred so they are running in parallel with > the init scripts. (Or maybe we could just require that certain kernel > modules can *only* be compiled as modules if they use rng_init --- > although that may get annoying for those of us who like being able to > build custom configured monolithic kernels. So I'd prefer the first > possibility if at all possible.) Right, indeed this is tricky, and you bring up good points about complex interactions on different architectures. Based on your comments about whack-a-mole, I think we're mostly on the same page about how arduous this is going to be to fix. My plan is to first get in this simple blocking API, because while it doesn't solve _every_ use case, there are particular use cases that are helped nearly perfectly by it. Namely, places where userspace is going to configure something. So, soon after I finish up this testing I've been doing all day on my series, I'll post v4, and hopefully we can get that merged, and then move onto interesting other solutions for the general problem. Regards, Jason
On Tue, Jun 6, 2017 at 7:26 PM, Eric Biggers <ebiggers3@gmail.com> wrote: > I agree that the use of ECB mode in big_key is broken, and thanks for trying to > fix it! I think using GCM is good, but please leave a very conspicuous comment > where the nonce is being set to 0, noting that it's safe only because a unique > key is used to encrypt every big_key *and* the big_keys are not updatable (via > an .update method in the key_type), resulting in each GCM key being used for > only a single encryption. Good idea. I'll make a large comment about this. > > Also, I think you should send this to the keyrings mailing list and maintainer > so it can be discussed and merged separately from your RNG changes. Yea, that seems like a good idea. I'll separate things out right now.
Am Dienstag, 6. Juni 2017, 19:03:19 CEST schrieb Theodore Ts'o: Hi Theodore, > On Tue, Jun 06, 2017 at 02:34:43PM +0200, Jason A. Donenfeld wrote: > > Yes, I agree whole-heartedly. A lot of people have proposals for > > fixing the direct idea of entropy gathering, but for whatever reason, > > Ted hasn't merged stuff. I think Stephan (CCd) rewrote big critical > > sections of the RNG, called LRNG, and published a big paper for peer > > review and did a lot of cool engineering, but for some reason this > > hasn't been integrated. I look forward to movement on this front in > > the future, if it ever happens. Would be great. > > So it's not clear what you mean by Stephan's work. It can be > separated into multiple pieces; one is simply using a mechanism which > can be directly mapped to NIST's DRBG framework. I don't believe this > actually adds any real security per se, but it can make it easier to > get certification for people who care about getting FIPS > certification. Since I've seen a lot of snake oil and massive waste > of taxpayer and industry dollars by FIPS certification firms, it's not > a thing I particularly find particularly compelling. > > The second bit is "Jitter Entropy". The problem I have with that is > there isn't any convincing explanation about why it can't be predicted > to some degree of accuracy with someone who understands what's going > on with Intel's cache architecture. (And this isn't just me, I've > talked to people who work at Intel and they are at best skeptical of > the whole idea.) My LRNG approach covers many more concerns rather than just using the Jitter RNG or using the DRBG. Using the Jitter RNG should just beef up the lacking entropy at boot time. Irrespective of what you think of it, it will not destroy existing entropy. Using the DRBG should allow crypto offloading and provides a small API for other users to plug in their favorite DRNG (like the ChaCha20 DRNG). I think I mentioned several times already which are the core concerns I have. But allow me to re-iterate them again as I have not seen any answer so far: - There is per definition a high correlation between interrupts and HID/block device events. The legacy /dev/random by far weights HID/block device noise higher in entropy than interrupts and awards interrupts hardly any entropy. But let us face it, HID and block devices are just a "derivative" of interrupts. Instead of weighting HID/block devices higher than interrupts, we should get rid of them when counting entropy and focus on interrupts. Interrupts fare very well even in virtualized environments where the legacy / dev/random hardly collects any entropy. Note, this interrupt behavior in virtual environments was the core motivation for developing the LRNG. - By not having such collision problems and the related low validations of entropy from interrupts, a much faster initialization with sufficient entropy is possible. This is now visible with the current initialization of the ChaCha20 part of the legacy /dev/random. That comes, however, at the cost that HID/disk events happening before the ChaCha20 is initialized are affected by the aforementioned correlation. Just to say it again, correlation destroys entropy. - The entropy estimate is based on first, second and third derivative of Jiffies. As Jiffies hardly contribute any entropy per event, using this number for an entropy estimation for an event is just coincidence that the legacy / dev/random underestimates entropy. And then using such coincidential estimates to apply an asymptotic calculation how much the entropy estimator is increased, is not really helpful. - The entropy transport within the legacy /dev/random allows small quantums (down to 8 bit minimums) of entropy to be transported. Such approach is a concern which can be illustrated with a pathological analogy (I understand that this pathological case is not present for the legacy /dev/random, but it illustrates the problem with small quantities of entropy). Assume that only one bit of entropy is conveyed from input_pool to blocking_pool during each read operation by an attacker from /dev/random (and assume that the attacker an read that one bit). Now, if 128 bits of entropy are transported with 128 individual transactions where the attacker can read data from the RNG between each transport, the final crypto strength is only 2 * 128 bits and not 2^128 bits. Thus, transports of entropy should be done in larger quantities (like 128 bits at least). - The DRNGs are fully testable by itself. The DRBG is tested using the kernel crypto API's testmgr using blessed test vectors. The ChaCha20 DRNG is implemented such that it can be extracted in a user space app to study it further (such extraction of the ChaCha20 into a standalone DRNG is provided at [1]). I tried to address those issues in the LRNG. Finally, I am very surprised that I get hardly any answers on patches to random.c let alone that any changes to random.c will be applied at all. Lastly, it is very easy to call an approach (JitterRNG) flawed, but I would like to see some back-up of such claim after the analysis that is provided on that topic. This analysis refers to the wait states between individual CPU components as the root of the noise knowing that the ISA != the hardware CPU instruction set and the evidence collected on various different CPUs. [1] https://github.com/smuellerDD/chacha20_drng Ciao Stephan
On Tue, Jun 6, 2017 at 7:57 PM, Stephan Müller <smueller@chronox.de> wrote: > Finally, I am very surprised that I get hardly any answers on patches to > random.c let alone that any changes to random.c will be applied at all. FWIW, this is my biggest concern too. You seem willing to work on this difficult problem. I'm simultaneously working on a different but related issue. I hope we're enabled to contribute.
On Tue, 06 Jun 2017, Theodore Ts'o wrote: > It might be possible, for example, to store a cryptographic key in a > UEFI boot-services variable, where the key becomes inaccessible after > the boot-time services terminate. But you also need either a reliable > time-of-day clock, or a reliable counter which is incremented each > time the system that boots, and which can't be messed with by an > attacker, or trivially reset by a clueless user/sysadmin. > > Or maybe we can have a script that is run at shutdown and boot-up that > stashes 32 bytes of entropy in a reserved space accessible to GRUB, > and which GRUB then passes to the kernel using an extension to the > Linux/x86 Boot Protocol. (See Documentation/x86/boot.txt) On that same idea, one could add an early_initramfs handler for entropy data. One could also ensure the kernel command line is used to feed some entropy for the CRNG init (for all I know, this is already being done), and add a do-nothing parameter (that gets sanitized away after use) that can be used to add entropy to that command line. Something like random.someentropy=<bootloader-supplied random stuff goes here>. This might be more generic than the x86 boot protocol reserved space... On the better bootloaders, an initramfs segment can be loaded independently (and you can have as many as required), which makes an early_initramfs a more palatable vector to inject large amounts of entropy into the next boot than, say, modifying the kernel image directly at every boot/shutdown to stash entropy in there somewhere. These are all easy to implement, I just don't know how *useful* they would really be.
On Tue, Jun 06, 2017 at 07:19:10PM -0300, Henrique de Moraes Holschuh wrote: > On that same idea, one could add an early_initramfs handler for entropy > data. > > One could also ensure the kernel command line is used to feed some > entropy for the CRNG init (for all I know, this is already being done), > and add a do-nothing parameter (that gets sanitized away after use) that > can be used to add entropy to that command line. Something like > random.someentropy=<bootloader-supplied random stuff goes here>. This > might be more generic than the x86 boot protocol reserved space... > > On the better bootloaders, an initramfs segment can be loaded > independently (and you can have as many as required), which makes an > early_initramfs a more palatable vector to inject large amounts of > entropy into the next boot than, say, modifying the kernel image > directly at every boot/shutdown to stash entropy in there somewhere. > > These are all easy to implement, I just don't know how *useful* they > would really be. +1 The kernel side for all of these are relatively easy. The hard part is to do an end-to-end solution. Which means the changes to the bootloader, the initramfs tools, etc. As I recall one issue with doing things in the initramfs scripts is that for certain uses of randomness (such as the stack canary), it's hard to change the valid canary after it's been used for a userspace process, since you might have to support more than one valid canary value until all of the proceses using the original (not necessarily cryptographically initialized) stack canary has exited. So while the upside is that it might not require any kernel changes to inject the randomness into the non-blocking pool via one of the initramfs scripts, from an overall simplicity for the kernel users, it's nice if we can initialize the CRNG as early possible --- in the ideal world, even before KASLR has been initialized, which means ****really**** early in the boot process. That's the advantage of doing it as part of the Linux/x86 boot protocol, since it's super simple to get at the entropy seed. It doesn't require parsing the kernel command-line. The tradeoff is that it is x86 specific, and the ARM, ppcle folks, etc. would have to implement their own way of injecting entropy super-early into the boot process. One advantage of doing this somewhat harder thing is that **all** of the issues around re-architecting a new rng_init initcall level, and dealing with module load order, etc., disappear if we can figure out a way to initialize the entropy pre-KASLR. Yes, it's harder; but it solves all of the issues at one fell swoop. - Ted
Am Mittwoch, 7. Juni 2017, 00:19:10 CEST schrieb Henrique de Moraes Holschuh: Hi Henrique, > On that same idea, one could add an early_initramfs handler for entropy > data. Any data that comes from outside during the boot process, be it some NVRAM location, the /var/lib...seed file for /dev/random or other approaches are viewed by a number of folks to have zero bits of entropy. I.e. this data is nice for stirring the pool, but is not considered to help our entropy problem. Ciao Stephan
On Wed, 07 Jun 2017, Stephan Müller wrote: > Am Mittwoch, 7. Juni 2017, 00:19:10 CEST schrieb Henrique de Moraes Holschuh: > > On that same idea, one could add an early_initramfs handler for entropy > > data. > > Any data that comes from outside during the boot process, be it some NVRAM > location, the /var/lib...seed file for /dev/random or other approaches are > viewed by a number of folks to have zero bits of entropy. > > I.e. this data is nice for stirring the pool, but is not considered to help > our entropy problem. Stirring the pool is actually the objective, not entropy accounting. Let's not lose sight of what is really important. But yes, you are quite correct in that it won't help anything that would like to block due to entropy accouting, or which needs to be certain about the amount of randomness in the pools.
> On the better bootloaders, an initramfs segment can be loaded > independently (and you can have as many as required), which makes an > early_initramfs a more palatable vector to inject large amounts of > entropy into the next boot than, say, modifying the kernel image > directly at every boot/shutdown to stash entropy in there somewhere. Modifying the kernel image on storage isn't compatible with verified boot so it's not really a solution. The kernel, initrd and rest of the OS are signed and verified on operating systems like Android, Android Things, ChromeOS and many embedded devices, etc. putting some basic effort into security. I didn't really understand the device tree approach and mentioned a few times before. Passing via the kernel cmdline is a lot simpler than modifying the device tree in-memory and persistent modification isn't an option unless verified boot is missing anyway.
On Wed, Jun 07, 2017 at 01:00:25PM -0400, Daniel Micay wrote: > > On the better bootloaders, an initramfs segment can be loaded > > independently (and you can have as many as required), which makes an > > early_initramfs a more palatable vector to inject large amounts of > > entropy into the next boot than, say, modifying the kernel image > > directly at every boot/shutdown to stash entropy in there somewhere. [...] > I didn't really understand the device tree approach and mentioned a > few times before. Passing via the kernel cmdline is a lot simpler than > modifying the device tree in-memory and persistent modification isn't > an option unless verified boot is missing anyway. I might be missing something here, but the command line is inside of the device tree, at /chosen/bootargs, so modifying the kernel command line *is* modifying the device tree in-memory. For arm64, we have a /chosen/kaslr-seed property that we hope FW/bootloaders fill in, and similar could be done for some initial entropy, provided appropriate HW/FW support. Thanks, Mark.
On Tue, Jun 06, 2017 at 01:03:19PM -0400, Theodore Ts'o wrote: > The other approach is to find a way to have initialized "seed" entropy > which we can count on at every boot. The problem is that this is very > much dependent on how the bootloader works. It's easy to say "store > it in the kernel", but where the kernel is stored varies greatly from > architecture to architecture. In some cases, the kernel can stored in > ROM, where it can't be modified at all. > > It might be possible, for example, to store a cryptographic key in a > UEFI boot-services variable, where the key becomes inaccessible after > the boot-time services terminate. But you also need either a reliable > time-of-day clock, or a reliable counter which is incremented each > time the system that boots, and which can't be messed with by an > attacker, or trivially reset by a clueless user/sysadmin. FWIW, EFI has an (optional) EFI_RNG_PROTOCOL, that we currently use to seed the kernel's entropy pool. The EFI stub creates a config table with the entropy, which the kernel reads. This is re-seeded prior to kexec() to avoid the entropy being recycled. See commits: 636259880a7e7d34 ("efi: Add support for seeding the RNG from a UEFI config table") 568bc4e87033d232 (" efi/arm*/libstub: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG table") Unfortunately, I beleive that support for the protocol is currently rare in practice. Thanks, Mark.
On Wed, Jun 07, 2017 at 07:00:17AM +0200, Stephan Müller wrote: > > On that same idea, one could add an early_initramfs handler for entropy > > data. > > Any data that comes from outside during the boot process, be it some NVRAM > location, the /var/lib...seed file for /dev/random or other approaches are > viewed by a number of folks to have zero bits of entropy. The Open BSD folks would disagree with you. They've designed their whole system around saving entropy at shutdown, reading it as early as possible by the bootloader, and then as soon as possible after the reboot, to overwrite and reinitialize the entropy seed stored on disk so that on a reboot after a crash. They consider this good enough to assume that their CRNG is *always* strongly initialized. I'll let you have that discussion/argument with Theo de Raadt, though. Be warned that he has opinions about security that are almost certainly as strong (and held with the same level of certainty) as a certain Brad Spengler... - Ted
On Wed, 2017-06-07 at 18:26 +0100, Mark Rutland wrote: > On Wed, Jun 07, 2017 at 01:00:25PM -0400, Daniel Micay wrote: > > > On the better bootloaders, an initramfs segment can be loaded > > > independently (and you can have as many as required), which makes > > > an > > > early_initramfs a more palatable vector to inject large amounts of > > > entropy into the next boot than, say, modifying the kernel image > > > directly at every boot/shutdown to stash entropy in there > > > somewhere. > > [...] > > > I didn't really understand the device tree approach and mentioned a > > few times before. Passing via the kernel cmdline is a lot simpler > > than > > modifying the device tree in-memory and persistent modification > > isn't > > an option unless verified boot is missing anyway. > > I might be missing something here, but the command line is inside of > the > device tree, at /chosen/bootargs, so modifying the kernel command line > *is* modifying the device tree in-memory. > > For arm64, we have a /chosen/kaslr-seed property that we hope > FW/bootloaders fill in, and similar could be done for some initial > entropy, provided appropriate HW/FW support. > > Thanks, > Mark. I was assuming it was simpler since bootloaders are already setting it, but it seems I'm wrong about that.
On Tue, Jun 06, 2017 at 05:56:20AM +0200, Jason A. Donenfeld wrote: > Hey Ted, > > On Tue, Jun 6, 2017 at 5:00 AM, Theodore Ts'o <tytso@mit.edu> wrote: > > Note that crypto_rng_reset() is called by big_key_init() in > > security/keys/big_key.c as a late_initcall(). So if we are on a > > system where the crng doesn't get initialized until during the system > > boot scripts, and big_key is compiled directly into the kernel, the > > boot could end up deadlocking. > > > > There may be other instances of where crypto_rng_reset() is called by > > an initcall, so big_key_init() may not be an exhaustive enumeration of > > potential problems. But this is an example of why the synchronous > > API, although definitely much more convenient, can end up being a trap > > for the unwary.... > > Thanks for pointing this out. I'll look more closely into it and see > if I can figure out a good way of approaching this. Would it work for wait_for_random_bytes() to include a WARN_ON(system_state < SYSTEM_RUNNING); to catch those kinds of cases? - Kevin
diff --git a/crypto/rng.c b/crypto/rng.c index f46dac5288b9..e042437e64b4 100644 --- a/crypto/rng.c +++ b/crypto/rng.c @@ -48,12 +48,14 @@ int crypto_rng_reset(struct crypto_rng *tfm, const u8 *seed, unsigned int slen) if (!buf) return -ENOMEM; - get_random_bytes(buf, slen); + err = get_random_bytes_wait(buf, slen); + if (err) + goto out; seed = buf; } err = crypto_rng_alg(tfm)->seed(tfm, seed, slen); - +out: kzfree(buf); return err; }
Otherwise, we might be seeding the RNG using bad randomness, which is dangerous. Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> --- crypto/rng.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)