diff mbox series

[8/8] crypto: api - make the algorithm lookup priorize non-larvals

Message ID 20211003181413.12465-9-nstange@suse.de (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Headers show
Series crypto: api - priorize tested algorithms in lookup | expand

Commit Message

Nicolai Stange Oct. 3, 2021, 6:14 p.m. UTC
crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper
to run the actual search for matching crypto_alg implementation and
larval entries. The latter is currently considering only the individual
entries' relative ->cra_priority for determining which one out of multiple
matches to return. This means that it would potentially dismiss a matching
crypto_alg implementation in working state in favor of some pending
testing larval of higher ->cra_priority. Now, if the testmgr instance
invoked asynchronously on that testing larval came to the conclusion that
it should mark the tests as failed, any pending crypto_alg_mod_lookup()
waiting for it would be made to fail as well with -EAGAIN.

In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even
though an implementation in working state would have been available, namely
if the testmgr asynchronously marked another, competing implementation of
higher ->cra_priority as failed.

This is normally not a problem at all with upstream, because the situation
where one algorithm passed its tests, but another competing one failed to
do so, would indicate a bug anyway.

However, for downstream distributions seeking FIPS certification, simply
amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries
matching on ->cra_driver_name would provide a convenient way of
selectively blacklisting implementations from drivers/crypto in fips
mode. Note that in this scenario failure of competing crypto_alg
implementations would become more common, in particular during device
enumeration. If the algorithm in question happened to be needed for e.g.
module signature verification, module loading could spuriously fail during
bootup, which is certainly not desired.

For transparency: this has not actually been observed, I merely came to
the conclusion that it would be possible by reading the code.

Make crypto_alg_lookup() run an additional search for non-larval matches
upfront in the common case that the request has been made for
CRYPTO_ALG_TESTED instances.

Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 crypto/api.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

Comments

Herbert Xu Oct. 8, 2021, 11:54 a.m. UTC | #1
On Sun, Oct 03, 2021 at 08:14:13PM +0200, Nicolai Stange wrote:
> crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper
> to run the actual search for matching crypto_alg implementation and
> larval entries. The latter is currently considering only the individual
> entries' relative ->cra_priority for determining which one out of multiple
> matches to return. This means that it would potentially dismiss a matching
> crypto_alg implementation in working state in favor of some pending
> testing larval of higher ->cra_priority. Now, if the testmgr instance
> invoked asynchronously on that testing larval came to the conclusion that
> it should mark the tests as failed, any pending crypto_alg_mod_lookup()
> waiting for it would be made to fail as well with -EAGAIN.
> 
> In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even
> though an implementation in working state would have been available, namely
> if the testmgr asynchronously marked another, competing implementation of
> higher ->cra_priority as failed.
> 
> This is normally not a problem at all with upstream, because the situation
> where one algorithm passed its tests, but another competing one failed to
> do so, would indicate a bug anyway.
> 
> However, for downstream distributions seeking FIPS certification, simply
> amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries
> matching on ->cra_driver_name would provide a convenient way of
> selectively blacklisting implementations from drivers/crypto in fips
> mode. Note that in this scenario failure of competing crypto_alg
> implementations would become more common, in particular during device
> enumeration. If the algorithm in question happened to be needed for e.g.
> module signature verification, module loading could spuriously fail during
> bootup, which is certainly not desired.
> 
> For transparency: this has not actually been observed, I merely came to
> the conclusion that it would be possible by reading the code.
> 
> Make crypto_alg_lookup() run an additional search for non-larval matches
> upfront in the common case that the request has been made for
> CRYPTO_ALG_TESTED instances.
> 
> Signed-off-by: Nicolai Stange <nstange@suse.de>
> ---
>  crypto/api.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)

It's not clear that this new behaviour is desirable.  For example,
when we construct certain complex algorithms, they may depend on a
generic version of that same algorithm as a fallback.  We do not
want users to get the generic version while the better version is
being tested.

Can you please explain what your failure scenario and perhaps we
can come up with another way of resolving your problem?

Thanks,
Nicolai Stange Oct. 11, 2021, 8:34 a.m. UTC | #2
Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Sun, Oct 03, 2021 at 08:14:13PM +0200, Nicolai Stange wrote:
>> crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper
>> to run the actual search for matching crypto_alg implementation and
>> larval entries. The latter is currently considering only the individual
>> entries' relative ->cra_priority for determining which one out of multiple
>> matches to return. This means that it would potentially dismiss a matching
>> crypto_alg implementation in working state in favor of some pending
>> testing larval of higher ->cra_priority. Now, if the testmgr instance
>> invoked asynchronously on that testing larval came to the conclusion that
>> it should mark the tests as failed, any pending crypto_alg_mod_lookup()
>> waiting for it would be made to fail as well with -EAGAIN.
>> 
>> In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even
>> though an implementation in working state would have been available, namely
>> if the testmgr asynchronously marked another, competing implementation of
>> higher ->cra_priority as failed.
>> 
>> This is normally not a problem at all with upstream, because the situation
>> where one algorithm passed its tests, but another competing one failed to
>> do so, would indicate a bug anyway.
>> 
>> However, for downstream distributions seeking FIPS certification, simply
>> amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries
>> matching on ->cra_driver_name would provide a convenient way of
>> selectively blacklisting implementations from drivers/crypto in fips
>> mode. Note that in this scenario failure of competing crypto_alg
>> implementations would become more common, in particular during device
>> enumeration. If the algorithm in question happened to be needed for e.g.
>> module signature verification, module loading could spuriously fail during
>> bootup, which is certainly not desired.
>> 
>> For transparency: this has not actually been observed, I merely came to
>> the conclusion that it would be possible by reading the code.
>> 
>> Make crypto_alg_lookup() run an additional search for non-larval matches
>> upfront in the common case that the request has been made for
>> CRYPTO_ALG_TESTED instances.
>> 
>> Signed-off-by: Nicolai Stange <nstange@suse.de>
>> ---
>>  crypto/api.c | 21 ++++++++++++++++++++-
>>  1 file changed, 20 insertions(+), 1 deletion(-)

Hi Herbert,

> It's not clear that this new behaviour is desirable.  For example,
> when we construct certain complex algorithms, they may depend on a
> generic version of that same algorithm as a fallback.  We do not
> want users to get the generic version while the better version is
> being tested.

Ah I see, you mean something like having 3+ providers of "algXY",
 - algXY_driver0, ->cra_priority = 100
 - algXY_driver1, ->cra_priority = 200
 - algXY_driver1, ->cra_priority = 300
where the latter needs a different "algXY" as a fallback?

Hmm yes, I haven't thought of this and my patch here would indeed break
it.


> Can you please explain what your failure scenario and perhaps we
> can come up with another way of resolving your problem?

In order to keep a FIPS certification manageable in terms of scope,
we're looking for a way to disable everything under drivers/crypto iff
fips_enabled == 1. The most convenient way to achieve this downstream
would be to add dummy entries to testmgr.c like so:

static int alg_test_nop(const struct alg_test_desc *desc, const char *driver,
			u32 type, u32 mask)
{
	/* Succeed in non-FIPS mode. */
	return 0;
}

static const struct alg_test_desc alg_test_descs[] = {
	...,
	{
		.alg = "sha256-padlock-nano",
		.test = alg_test_nop,
		.fips_allowed = 0,
	},
        ...
};

The concern is about e.g the following sequence of events during boot:
- "sha256-padlock-nano" gets registered, the test gets scheduled.
- An unrelated modprobe is getting invoked. The signature verification
  code, e.g pkcs7_digest(), requests "sha256", finds the pending
  "sha256-padlock-nano" testing larval and puts itself in a wait for it.
- The scheduled "sha256-padlock-nano" test gets to run and, as per
  ->fips_allowed == 0, is forced to fail with -EINVAL.
- pkcs7_digest() wakes up and fails with -EAGAIN due to the "failed"
  testing larval.
- The unrelated modprobe fails even though sha256-generic would
  have been available all the time.

FWIW, I picked sha256-padlock-nano and modprobe at random for the sake
of providing an example here -- I haven't checked in detail, but I guess
that e.g. the combination of dm-crypt + a number of different FIPS
approved algorithms could be similarly susceptible, too.

So to recap, my initial approach for working around the above was to
make crypto_alg_lookup() to skip over testing larvals in an additional,
first search. As you said, this would break the "fallback" scenario
though.

As an alternative, how about not doing the additional search for
non-larvals upfront, but only as a resort in case crypto_larval_wait()
returned failure instead?

But granted, the scenario above is not an issue at all for upstream with
a pristine testmgr.c and it would be quite relatable if you wouldn't
like to get bothered with any of this. I'm only bringing it up because
others might perhaps come across this as well...


Thanks!

Nicolai
Herbert Xu Oct. 22, 2021, 11:51 a.m. UTC | #3
On Mon, Oct 11, 2021 at 10:34:11AM +0200, Nicolai Stange wrote:
>
> In order to keep a FIPS certification manageable in terms of scope,
> we're looking for a way to disable everything under drivers/crypto iff
> fips_enabled == 1. The most convenient way to achieve this downstream
> would be to add dummy entries to testmgr.c like so:

How about testing based on the flag CRYPTO_ALG_KERN_DRIVER_ONLY?
That flag is meant to correspond to pretty much exactly drivers/crypto.

Cheers,
Nicolai Stange Oct. 27, 2021, 9:59 a.m. UTC | #4
Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Mon, Oct 11, 2021 at 10:34:11AM +0200, Nicolai Stange wrote:
>>
>> In order to keep a FIPS certification manageable in terms of scope,
>> we're looking for a way to disable everything under drivers/crypto iff
>> fips_enabled == 1. The most convenient way to achieve this downstream
>> would be to add dummy entries to testmgr.c like so:
>
> How about testing based on the flag CRYPTO_ALG_KERN_DRIVER_ONLY?
> That flag is meant to correspond to pretty much exactly
> drivers/crypto.

Hmm, I checked a couple of crypto/drivers and it seems like not all are
setting this flag: random examples would include cavium/nitrox/,
chelsio/, padlock*.c, vmx/, ...

Many thanks!

Nicolai
Herbert Xu Oct. 28, 2021, 2:42 a.m. UTC | #5
On Wed, Oct 27, 2021 at 11:59:26AM +0200, Nicolai Stange wrote:
>
> Hmm, I checked a couple of crypto/drivers and it seems like not all are
> setting this flag: random examples would include cavium/nitrox/,
> chelsio/, padlock*.c, vmx/, ...

For padlock and vmx this is correct as they're just like aesni.

The others are omissions and should be fixed.

Thanks,
diff mbox series

Patch

diff --git a/crypto/api.c b/crypto/api.c
index 594c494a27d9..4251eedef668 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -239,8 +239,25 @@  static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
 	struct crypto_alg *alg;
 
 	down_read(&crypto_alg_sem);
+	/*
+	 * If the request is for an algorithm having passed its
+	 * selftests, try to serve it with a matching instance already
+	 * in operational state. That is, skip pending larvals in a
+	 * first search: for these it is not guaranteed that they will
+	 * eventually turn out successful and it would be a pity to
+	 * potentially fail the request even though a working
+	 * implementation would have been available. If OTOH the
+	 * request is *not* for an algorithm having passed its
+	 * selftest (yet), no larval can match it anyway, so the
+	 * CRYPTO_ALG_LARVAL in the mask below won't make a
+	 * difference.
+	 */
+	alg = __crypto_alg_lookup(name, type, mask | CRYPTO_ALG_LARVAL);
+	if (alg || !(type & CRYPTO_ALG_TESTED))
+		goto out;
+
 	alg = __crypto_alg_lookup(name, type, mask);
-	if (!alg && (type & CRYPTO_ALG_TESTED)) {
+	if (!alg) {
 		/*
 		 * Check whether there's an instance which failed the
 		 * selftests in order to avoid pointless retries.
@@ -254,6 +271,8 @@  static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
 			alg = ERR_PTR(-ELIBBAD);
 		}
 	}
+
+out:
 	up_read(&crypto_alg_sem);
 
 	return alg;