[v4,0/6] crypto: switch to crypto API for ESSIV generation

Message ID: 20190621080918.22809-1-ard.biesheuvel@arm.com

Ard Biesheuvel June 21, 2019, 8:09 a.m. UTC
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

This series creates an ESSIV template that produces a skcipher or AEAD
transform based on a tuple of the form '<skcipher>,<cipher>,<shash>'
(or '<aead>,<cipher>,<shash>' for the AEAD case). It exposes the
encapsulated sync or async skcipher/aead by passing through all operations,
while using the cipher/shash pair to transform the input IV into an ESSIV
output IV.
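
For illustration, instantiating and using the template from kernel code
would look roughly like this (a minimal sketch; error handling is elided,
key/keylen/buf/len/sector are assumed to exist, and the outer IV follows
this revision's convention of being the __le64 sector number):

/* Sketch only -- not part of the series; error handling omitted. */
struct crypto_skcipher *tfm;
struct skcipher_request *req;
struct scatterlist sg;
__le64 iv;

tfm = crypto_alloc_skcipher("essiv(cbc(aes),aes,sha256)", 0, 0);
crypto_skcipher_setkey(tfm, key, keylen);  /* H(key) keys the ESSIV cipher */

req = skcipher_request_alloc(tfm, GFP_KERNEL);
sg_init_one(&sg, buf, len);
iv = cpu_to_le64(sector);   /* template encrypts this into the inner IV */
skcipher_request_set_crypt(req, &sg, &sg, len, &iv);
crypto_skcipher_encrypt(req);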

The template matches what both in-kernel users of ESSIV do, and so it is
proposed as a replacement for their open-coded versions in patches #2 and #4.

This code has been tested using the fscrypt test suggested by Eric
(generic/549), as well as the mode-test script suggested by Milan for
the dm-crypt case. I also tested the AEAD case in a virtual machine,
but it definitely needs some wider testing from the dm-crypt experts.

Open issues (more discussion needed):
- given that hardware already exists that can perform en/decryption including
  ESSIV generation of a range of blocks, it would be useful to encapsulate
  this in the ESSIV template, and teach at least dm-crypt how to use it
  (given that it often processes 8 512-byte sectors at a time)
- given the above, it may or may not make sense to keep the accelerated
  implementation in patch #6 (and teach it to increment the LE counter after
  each sector)

Changes since v3:
- address various review comments from Eric on patch #1
- use Kconfig's 'imply' instead of 'select' to permit CRYPTO_ESSIV to be
  enabled as a module or disabled entirely even if fscrypt is compiled in (#2)
- fix an issue in the AEAD encrypt path caused by the IV being clobbered by
  the inner skcipher before the HMAC was calculated

Changes since v2:
- fixed a couple of bugs that snuck in after I'd done the bulk of my
  testing
- some cosmetic tweaks to the ESSIV template skcipher setkey function
  to align it with the AEAD one
- add a test case for essiv(cbc(aes),aes,sha256)
- add an accelerated implementation for arm64 that combines the IV
  derivation and the actual en/decryption in a single asm routine

Scroll down for tcrypt speed test results comparing the essiv template
with the asm implementation. Bare cbc(aes) tests are included for reference
as well. The numbers were taken on a 2 GHz Cortex-A57 (AMD Seattle).

Code can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=essiv-v4

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Eric Biggers <ebiggers@google.com>
Cc: dm-devel@redhat.com
Cc: linux-fscrypt@vger.kernel.org
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Milan Broz <gmazyland@gmail.com>

Ard Biesheuvel (6):
  crypto: essiv - create wrapper template for ESSIV generation
  fs: crypto: invoke crypto API for ESSIV handling
  md: dm-crypt: infer ESSIV block cipher from cipher string directly
  md: dm-crypt: switch to ESSIV crypto API template
  crypto: essiv - add test vector for essiv(cbc(aes),aes,sha256)
  crypto: arm64/aes - implement accelerated ESSIV/CBC mode

 arch/arm64/crypto/aes-glue.c  | 129 ++++
 arch/arm64/crypto/aes-modes.S |  99 +++
 crypto/Kconfig                |   4 +
 crypto/Makefile               |   1 +
 crypto/essiv.c                | 639 ++++++++++++++++++++
 crypto/tcrypt.c               |   9 +
 crypto/testmgr.c              |   6 +
 crypto/testmgr.h              | 208 +++++++
 drivers/md/Kconfig            |   1 +
 drivers/md/dm-crypt.c         | 237 ++------
 fs/crypto/Kconfig             |   1 +
 fs/crypto/crypto.c            |   5 -
 fs/crypto/fscrypt_private.h   |   9 -
 fs/crypto/keyinfo.c           |  88 +--
 14 files changed, 1141 insertions(+), 295 deletions(-)
 create mode 100644 crypto/essiv.c

Comments

Ard Biesheuvel June 23, 2019, 9:30 a.m. UTC | #1
On Fri, 21 Jun 2019 at 10:09, Ard Biesheuvel <ard.biesheuvel@arm.com> wrote:
>
> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
...
>
> - given that hardware already exists that can perform en/decryption including
>   ESSIV generation of a range of blocks, it would be useful to encapsulate
>   this in the ESSIV template, and teach at least dm-crypt how to use it
>   (given that it often processes 8 512-byte sectors at a time)

I thought about this a bit more, and it occurred to me that the
capability of issuing several sectors at a time and letting the lower
layers increment the IV between sectors is orthogonal to whether ESSIV
is being used or not, and so it probably belongs in another wrapper.

I.e., if we define a skcipher template like dmplain64le(), which takes a
sector size as part of the key, and which increments a 64-bit LE counter
between sectors if multiple are passed, it can be used not only for ESSIV
but also for XTS, which I assume can be h/w accelerated in the same way.
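
The IV handling of such a wrapper would boil down to something like the
following (sketch only -- dmplain64le() does not exist, and every name
below is hypothetical):

/* Derive the IV for the i-th sector of a multi-sector request, as a
 * dmplain64le() wrapper might do internally between inner requests.
 */
static void dmplain64le_sector_iv(u8 *iv, unsigned int ivsize,
				  u64 start_sector, unsigned int i)
{
	memset(iv, 0, ivsize);
	put_unaligned_le64(start_sector + i, iv); /* 64-bit LE counter */
}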

So with that in mind, I think we should decouple the multi-sector
discussion and leave it for a followup series, preferably proposed by
someone who also has access to some hardware to prototype it on.
Herbert Xu June 23, 2019, 10:12 a.m. UTC | #2
On Sun, Jun 23, 2019 at 11:30:41AM +0200, Ard Biesheuvel wrote:
>
> So with that in mind, I think we should decouple the multi-sector
> discussion and leave it for a followup series, preferably proposed by
> someone who also has access to some hardware to prototype it on.

Yes that makes sense.

Thanks,
Milan Broz June 24, 2019, 6:52 a.m. UTC | #3
On 23/06/2019 12:12, Herbert Xu wrote:
> On Sun, Jun 23, 2019 at 11:30:41AM +0200, Ard Biesheuvel wrote:
>>
>> So with that in mind, I think we should decouple the multi-sector
>> discussion and leave it for a followup series, preferably proposed by
>> someone who also has access to some hardware to prototype it on.
> 
> Yes that makes sense.

Yes.

And TBH, the most important optimization for dm-crypt in this case
is processing 8 512-byte sectors as one 4k encryption block (because the
page cache will generate page-sized bios), with XTS mode and a linear IV
(plain64), not ESSIV.

Dm-crypt can use 4k sectors directly; there are only two
blockers: you need LUKS2 to support it, and many devices
just do not advertise physical 4k sectors (many SSDs).
So switching to 4k could cause some problems with partial 4k writes
(after a crash or power failure).

The plan on the dm-crypt side is to focus more on using native 4k
sectors than on this micro-optimization in HW.

Milan
Eric Biggers June 26, 2019, 4:49 a.m. UTC | #4
On Sun, Jun 23, 2019 at 11:30:41AM +0200, Ard Biesheuvel wrote:
> On Fri, 21 Jun 2019 at 10:09, Ard Biesheuvel <ard.biesheuvel@arm.com> wrote:
> >
> > From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >
> ...
> >
> > - given that hardware already exists that can perform en/decryption including
> >   ESSIV generation of a range of blocks, it would be useful to encapsulate
> >   this in the ESSIV template, and teach at least dm-crypt how to use it
> >   (given that it often processes 8 512-byte sectors at a time)
> 
> I thought about this a bit more, and it occurred to me that the
> capability of issuing several sectors at a time and letting the lower
> layers increment the IV between sectors is orthogonal to whether ESSIV
> is being used or not, and so it probably belongs in another wrapper.
> 
> I.e., if we define a skcipher template like dmplain64le(), which takes a
> sector size as part of the key, and which increments a 64-bit LE counter
> between sectors if multiple are passed, it can be used not only for ESSIV
> but also for XTS, which I assume can be h/w accelerated in the same way.
> 
> So with that in mind, I think we should decouple the multi-sector
> discussion and leave it for a followup series, preferably proposed by
> someone who also has access to some hardware to prototype it on.
> 

This makes sense, but if we're going to leave that functionality out of the
essiv template, I think we should revisit whether the essiv template takes a
__le64 sector number vs. just an IV matching the cipher block size.  To me,
defining the IV to be a __le64 seems like a layering violation.  Also, dm-crypt
and fscrypt already know how to zero-pad the sector number to form the full
16-byte IV, and your patch just duplicates that logic in the essiv template too,
which makes it more complicated than necessary.

E.g., the following incremental patch for the skcipher case would simplify it:

(You'd have to do it for the AEAD case too.)

diff --git a/crypto/essiv.c b/crypto/essiv.c
index 8e80814ec7d6..737e92ebcbd8 100644
--- a/crypto/essiv.c
+++ b/crypto/essiv.c
@@ -57,11 +57,6 @@ struct essiv_tfm_ctx {
 	struct crypto_shash		*hash;
 };
 
-struct essiv_skcipher_request_ctx {
-	u8				iv[MAX_INNER_IV_SIZE];
-	struct skcipher_request		skcipher_req;
-};
-
 struct essiv_aead_request_ctx {
 	u8				iv[2][MAX_INNER_IV_SIZE];
 	struct scatterlist		src[4], dst[4];
@@ -161,39 +156,32 @@ static void essiv_skcipher_done(struct crypto_async_request *areq, int err)
 	skcipher_request_complete(req, err);
 }
 
-static void essiv_skcipher_prepare_subreq(struct skcipher_request *req)
+static int essiv_skcipher_crypt(struct skcipher_request *req, bool enc)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	const struct essiv_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
-	struct essiv_skcipher_request_ctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq = &rctx->skcipher_req;
-
-	memset(rctx->iv, 0, crypto_cipher_blocksize(tctx->essiv_cipher));
-	memcpy(rctx->iv, req->iv, crypto_skcipher_ivsize(tfm));
+	struct skcipher_request *subreq = skcipher_request_ctx(req);
 
-	crypto_cipher_encrypt_one(tctx->essiv_cipher, rctx->iv, rctx->iv);
+	crypto_cipher_encrypt_one(tctx->essiv_cipher, req->iv, req->iv);
 
 	skcipher_request_set_tfm(subreq, tctx->u.skcipher);
 	skcipher_request_set_crypt(subreq, req->src, req->dst, req->cryptlen,
-				   rctx->iv);
+				   req->iv);
 	skcipher_request_set_callback(subreq, skcipher_request_flags(req),
 				      essiv_skcipher_done, req);
+
+	return enc ? crypto_skcipher_encrypt(subreq) :
+		     crypto_skcipher_decrypt(subreq);
 }
 
 static int essiv_skcipher_encrypt(struct skcipher_request *req)
 {
-	struct essiv_skcipher_request_ctx *rctx = skcipher_request_ctx(req);
-
-	essiv_skcipher_prepare_subreq(req);
-	return crypto_skcipher_encrypt(&rctx->skcipher_req);
+	return essiv_skcipher_crypt(req, true);
 }
 
 static int essiv_skcipher_decrypt(struct skcipher_request *req)
 {
-	struct essiv_skcipher_request_ctx *rctx = skcipher_request_ctx(req);
-
-	essiv_skcipher_prepare_subreq(req);
-	return crypto_skcipher_decrypt(&rctx->skcipher_req);
+	return essiv_skcipher_crypt(req, false);
 }
 
 static void essiv_aead_done(struct crypto_async_request *areq, int err)
@@ -300,24 +288,14 @@ static int essiv_skcipher_init_tfm(struct crypto_skcipher *tfm)
 	struct essiv_instance_ctx *ictx = skcipher_instance_ctx(inst);
 	struct essiv_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
 	struct crypto_skcipher *skcipher;
-	unsigned int subreq_size;
 	int err;
 
-	BUILD_BUG_ON(offsetofend(struct essiv_skcipher_request_ctx,
-				 skcipher_req) !=
-		     sizeof(struct essiv_skcipher_request_ctx));
-
 	skcipher = crypto_spawn_skcipher(&ictx->u.skcipher_spawn);
 	if (IS_ERR(skcipher))
 		return PTR_ERR(skcipher);
 
-	subreq_size = FIELD_SIZEOF(struct essiv_skcipher_request_ctx,
-				   skcipher_req) +
-		      crypto_skcipher_reqsize(skcipher);
-
-	crypto_skcipher_set_reqsize(tfm,
-				    offsetof(struct essiv_skcipher_request_ctx,
-					     skcipher_req) + subreq_size);
+	crypto_skcipher_set_reqsize(tfm, sizeof(struct skcipher_request) +
+				    crypto_skcipher_reqsize(skcipher));
 
 	err = essiv_init_tfm(ictx, tctx);
 	if (err) {
@@ -567,9 +545,9 @@ static int essiv_create(struct crypto_template *tmpl, struct rtattr **tb)
 
 		skcipher_inst->alg.min_keysize	= crypto_skcipher_alg_min_keysize(skcipher_alg);
 		skcipher_inst->alg.max_keysize	= crypto_skcipher_alg_max_keysize(skcipher_alg);
-		skcipher_inst->alg.ivsize	= ESSIV_IV_SIZE;
-		skcipher_inst->alg.chunksize	= skcipher_alg->chunksize;
-		skcipher_inst->alg.walksize	= skcipher_alg->walksize;
+		skcipher_inst->alg.ivsize	= crypto_skcipher_alg_ivsize(skcipher_alg);
+		skcipher_inst->alg.chunksize	= crypto_skcipher_alg_chunksize(skcipher_alg);
+		skcipher_inst->alg.walksize	= crypto_skcipher_alg_walksize(skcipher_alg);
 
 		skcipher_inst->free		= essiv_skcipher_free_instance;