From patchwork Mon Sep 3 11:29:42 2018
X-Patchwork-Submitter: Ondrej Mosnacek
X-Patchwork-Id: 10586005
X-Patchwork-Delegate: snitzer@redhat.com
From: Ondrej Mosnacek
To: Herbert Xu
Cc: dm-devel@redhat.com, Mikulas Patocka, linux-crypto@vger.kernel.org,
    Ondrej Mosnacek
Date: Mon, 3 Sep 2018 13:29:42 +0200
Message-Id: <20180903112942.18979-1-omosnace@redhat.com>
Subject: [dm-devel] [PATCH] crypto: xts - Drop use of auxiliary buffer

Hi Herbert, Mikulas,

I noticed the discussion about using kmalloc() inside crypto code and it
made me wonder whether the code in xts.c can be simplified so that it does
not require kmalloc() at all, without badly affecting performance. I played
around with it a little and it turns out we can drop the whole caching of
tweak blocks, which reduces the code size by ~200 lines and not only
preserves but in some cases even improves performance. See the full patch
below.

Obviously, this doesn't solve the general issue of using kmalloc() in
crypto API routines, but it does resolve the originally reported problem
and also makes the XTS code easier to maintain.

Regards,
Ondrej

----->8-----
Since commit acb9b159c784 ("crypto: gf128mul - define gf128mul_x_* in
gf128mul.h"), the gf128mul_x_*() functions are very fast and therefore
caching the computed XTS tweaks has only negligible advantage over
computing them twice.

In fact, since the current caching implementation limits the size of
the calls to the child ecb(...) algorithm to PAGE_SIZE (usually 4096 B),
it is often actually slower than the simple recomputing implementation.

This patch simplifies the XTS template to recompute the XTS tweaks from
scratch in the second pass and thus also removes the need to allocate a
dynamic buffer using kmalloc().

As discussed at [1], the use of kmalloc causes deadlocks with dm-crypt.

PERFORMANCE RESULTS
I measured time to encrypt/decrypt a memory buffer of varying sizes with
xts(ecb-aes-aesni) using a tool I wrote ([2]) and the results suggest
that after this patch the performance is either better or comparable for
both small and large buffers. Note that there is a lot of noise in the
measurements, but the overall difference is easy to see.

Old code:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        xts(aes)     256              64             331             328
        xts(aes)     384              64             332             333
        xts(aes)     512              64             338             348
        xts(aes)     256             512             889             920
        xts(aes)     384             512            1019             993
        xts(aes)     512             512            1032             990
        xts(aes)     256            4096            2152            2292
        xts(aes)     384            4096            2453            2597
        xts(aes)     512            4096            3041            2641
        xts(aes)     256           16384            9443            8027
        xts(aes)     384           16384            8536            8925
        xts(aes)     512           16384            9232            9417
        xts(aes)     256           32768           16383           14897
        xts(aes)     384           32768           17527           16102
        xts(aes)     512           32768           18483           17322

New code:
       ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
        xts(aes)     256              64             328             324
        xts(aes)     384              64             324             319
        xts(aes)     512              64             320             322
        xts(aes)     256             512             476             473
        xts(aes)     384             512             509             492
        xts(aes)     512             512             531             514
        xts(aes)     256            4096            2132            1829
        xts(aes)     384            4096            2357            2055
        xts(aes)     512            4096            2178            2027
        xts(aes)     256           16384            6920            6983
        xts(aes)     384           16384            8597            7505
        xts(aes)     512           16384            7841            8164
        xts(aes)     256           32768           13468           12307
        xts(aes)     384           32768           14808           13402
        xts(aes)     512           32768           15753           14636

[1] https://lkml.org/lkml/2018/8/23/1315
[2] https://gitlab.com/omos/linux-crypto-bench

Cc: Mikulas Patocka
Signed-off-by: Ondrej Mosnacek
---
 crypto/xts.c | 258 ++++++---------------------------------------------
 1 file changed, 30 insertions(+), 228 deletions(-)

diff --git a/crypto/xts.c b/crypto/xts.c
index 12284183bd20..6c49e76df8ca 100644
--- a/crypto/xts.c
+++ b/crypto/xts.c
@@ -26,8 +26,6 @@
 #include
 #include
 
-#define XTS_BUFFER_SIZE 128u
-
 struct priv {
 	struct crypto_skcipher *child;
 	struct crypto_cipher *tweak;
@@ -39,19 +37,7 @@ struct xts_instance_ctx {
 };
 
 struct rctx {
-	le128 buf[XTS_BUFFER_SIZE / sizeof(le128)];
-
 	le128 t;
-
-	le128 *ext;
-
-	struct scatterlist srcbuf[2];
-	struct scatterlist dstbuf[2];
-	struct scatterlist *src;
-	struct scatterlist *dst;
-
-	unsigned int left;
-
 	struct skcipher_request subreq;
 };
 
@@ -96,265 +82,81 @@ static int setkey(struct crypto_skcipher *parent, const u8 *key,
 	return err;
 }
 
-static int post_crypt(struct skcipher_request *req)
+static int xor_tweak(struct rctx *rctx, struct skcipher_request *req)
 {
-	struct rctx *rctx = skcipher_request_ctx(req);
-	le128 *buf = rctx->ext ?: rctx->buf;
-	struct skcipher_request *subreq;
 	const int bs = XTS_BLOCK_SIZE;
 	struct skcipher_walk w;
-	struct scatterlist *sg;
-	unsigned offset;
+	le128 t = rctx->t;
 	int err;
 
-	subreq = &rctx->subreq;
-	err = skcipher_walk_virt(&w, subreq, false);
+	err = skcipher_walk_virt(&w, req, false);
 
 	while (w.nbytes) {
 		unsigned int avail = w.nbytes;
+		le128 *wsrc;
 		le128 *wdst;
 
+		wsrc = w.src.virt.addr;
 		wdst = w.dst.virt.addr;
 
 		do {
-			le128_xor(wdst, buf++, wdst);
-			wdst++;
+			le128_xor(wdst++, &t, wsrc++);
+			gf128mul_x_ble(&t, &t);
 		} while ((avail -= bs) >= bs);
 
 		err = skcipher_walk_done(&w, avail);
 	}
 
-	rctx->left -= subreq->cryptlen;
-
-	if (err || !rctx->left)
-		goto out;
-
-	rctx->dst = rctx->dstbuf;
-
-	scatterwalk_done(&w.out, 0, 1);
-	sg = w.out.sg;
-	offset = w.out.offset;
-
-	if (rctx->dst != sg) {
-		rctx->dst[0] = *sg;
-		sg_unmark_end(rctx->dst);
-		scatterwalk_crypto_chain(rctx->dst, sg_next(sg), 0, 2);
-	}
-	rctx->dst[0].length -= offset - sg->offset;
-	rctx->dst[0].offset = offset;
-
-out:
 	return err;
 }
 
-static int pre_crypt(struct skcipher_request *req)
+static void crypt_done(struct crypto_async_request *areq, int err)
 {
+	struct skcipher_request *req = areq->data;
 	struct rctx *rctx = skcipher_request_ctx(req);
-	le128 *buf = rctx->ext ?: rctx->buf;
-	struct skcipher_request *subreq;
-	const int bs = XTS_BLOCK_SIZE;
-	struct skcipher_walk w;
-	struct scatterlist *sg;
-	unsigned cryptlen;
-	unsigned offset;
-	bool more;
-	int err;
-
-	subreq = &rctx->subreq;
-	cryptlen = subreq->cryptlen;
-
-	more = rctx->left > cryptlen;
-	if (!more)
-		cryptlen = rctx->left;
-
-	skcipher_request_set_crypt(subreq, rctx->src, rctx->dst,
-				   cryptlen, NULL);
-
-	err = skcipher_walk_virt(&w, subreq, false);
-
-	while (w.nbytes) {
-		unsigned int avail = w.nbytes;
-		le128 *wsrc;
-		le128 *wdst;
-
-		wsrc = w.src.virt.addr;
-		wdst = w.dst.virt.addr;
-
-		do {
-			*buf++ = rctx->t;
-			le128_xor(wdst++, &rctx->t, wsrc++);
-			gf128mul_x_ble(&rctx->t, &rctx->t);
-		} while ((avail -= bs) >= bs);
-
-		err = skcipher_walk_done(&w, avail);
-	}
-
-	skcipher_request_set_crypt(subreq, rctx->dst, rctx->dst,
-				   cryptlen, NULL);
+	struct skcipher_request *subreq = &rctx->subreq;
 
-	if (err || !more)
-		goto out;
+	if (!err)
+		err = xor_tweak(rctx, subreq);
 
-	rctx->src = rctx->srcbuf;
-
-	scatterwalk_done(&w.in, 0, 1);
-	sg = w.in.sg;
-	offset = w.in.offset;
-
-	if (rctx->src != sg) {
-		rctx->src[0] = *sg;
-		sg_unmark_end(rctx->src);
-		scatterwalk_crypto_chain(rctx->src, sg_next(sg), 0, 2);
-	}
-	rctx->src[0].length -= offset - sg->offset;
-	rctx->src[0].offset = offset;
-
-out:
-	return err;
+	skcipher_request_complete(req, err);
 }
 
-static int init_crypt(struct skcipher_request *req, crypto_completion_t done)
+static void init_crypt(struct skcipher_request *req)
 {
 	struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
 	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
-	gfp_t gfp;
+	struct skcipher_request *subreq = &rctx->subreq;
 
-	subreq = &rctx->subreq;
 	skcipher_request_set_tfm(subreq, ctx->child);
-	skcipher_request_set_callback(subreq, req->base.flags, done, req);
-
-	gfp = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL :
-							   GFP_ATOMIC;
-	rctx->ext = NULL;
-
-	subreq->cryptlen = XTS_BUFFER_SIZE;
-	if (req->cryptlen > XTS_BUFFER_SIZE) {
-		unsigned int n = min(req->cryptlen, (unsigned int)PAGE_SIZE);
-
-		rctx->ext = kmalloc(n, gfp);
-		if (rctx->ext)
-			subreq->cryptlen = n;
-	}
-
-	rctx->src = req->src;
-	rctx->dst = req->dst;
-	rctx->left = req->cryptlen;
+	skcipher_request_set_callback(subreq, req->base.flags, crypt_done, req);
+	skcipher_request_set_crypt(subreq, req->dst, req->dst,
+				   req->cryptlen, NULL);
 
 	/* calculate first value of T */
 	crypto_cipher_encrypt_one(ctx->tweak, (u8 *)&rctx->t, req->iv);
-
-	return 0;
-}
-
-static void exit_crypt(struct skcipher_request *req)
-{
-	struct rctx *rctx = skcipher_request_ctx(req);
-
-	rctx->left = 0;
-
-	if (rctx->ext)
-		kzfree(rctx->ext);
-}
-
-static int do_encrypt(struct skcipher_request *req, int err)
-{
-	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
-
-	subreq = &rctx->subreq;
-
-	while (!err && rctx->left) {
-		err = pre_crypt(req) ?:
-		      crypto_skcipher_encrypt(subreq) ?:
-		      post_crypt(req);
-
-		if (err == -EINPROGRESS || err == -EBUSY)
-			return err;
-	}
-
-	exit_crypt(req);
-	return err;
-}
-
-static void encrypt_done(struct crypto_async_request *areq, int err)
-{
-	struct skcipher_request *req = areq->data;
-	struct skcipher_request *subreq;
-	struct rctx *rctx;
-
-	rctx = skcipher_request_ctx(req);
-
-	if (err == -EINPROGRESS) {
-		if (rctx->left != req->cryptlen)
-			return;
-		goto out;
-	}
-
-	subreq = &rctx->subreq;
-	subreq->base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG;
-
-	err = do_encrypt(req, err ?: post_crypt(req));
-	if (rctx->left)
-		return;
-
-out:
-	skcipher_request_complete(req, err);
 }
 
 static int encrypt(struct skcipher_request *req)
-{
-	return do_encrypt(req, init_crypt(req, encrypt_done));
-}
-
-static int do_decrypt(struct skcipher_request *req, int err)
 {
 	struct rctx *rctx = skcipher_request_ctx(req);
-	struct skcipher_request *subreq;
-
-	subreq = &rctx->subreq;
+	struct skcipher_request *subreq = &rctx->subreq;
 
-	while (!err && rctx->left) {
-		err = pre_crypt(req) ?:
-		      crypto_skcipher_decrypt(subreq) ?:
-		      post_crypt(req);
-
-		if (err == -EINPROGRESS || err == -EBUSY)
-			return err;
-	}
-
-	exit_crypt(req);
-	return err;
-}
-
-static void decrypt_done(struct crypto_async_request *areq, int err)
-{
-	struct skcipher_request *req = areq->data;
-	struct skcipher_request *subreq;
-	struct rctx *rctx;
-
-	rctx = skcipher_request_ctx(req);
-
-	if (err == -EINPROGRESS) {
-		if (rctx->left != req->cryptlen)
-			return;
-		goto out;
-	}
-
-	subreq = &rctx->subreq;
-	subreq->base.flags &= CRYPTO_TFM_REQ_MAY_BACKLOG;
-
-	err = do_decrypt(req, err ?: post_crypt(req));
-	if (rctx->left)
-		return;
-
-out:
-	skcipher_request_complete(req, err);
+	init_crypt(req);
+	return xor_tweak(rctx, req) ?:
+		crypto_skcipher_encrypt(subreq) ?:
+		xor_tweak(rctx, subreq);
 }
 
 static int decrypt(struct skcipher_request *req)
 {
-	return do_decrypt(req, init_crypt(req, decrypt_done));
+	struct rctx *rctx = skcipher_request_ctx(req);
+	struct skcipher_request *subreq = &rctx->subreq;
+
+	init_crypt(req);
+	return xor_tweak(rctx, req) ?:
+		crypto_skcipher_decrypt(subreq) ?:
+		xor_tweak(rctx, subreq);
 }
 
 static int init_tfm(struct crypto_skcipher *tfm)
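
Illustration only, not part of the patch: a minimal userspace sketch of the
idea behind the change, namely that regenerating the tweak sequence from the
first value of T in the second XOR pass gives the same result as caching the
tweaks during the first pass. Everything here is an assumption made for the
example: blk128 is a host-endian stand-in for the kernel's le128,
tweak_mul_x() mimics what gf128mul_x_ble() computes, and toy_ecb() is a
made-up invertible transform standing in for the child ecb(...) call (it is
not a cipher).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's le128; host-endian for brevity. */
typedef struct { uint64_t lo, hi; } blk128;

/* Multiply the tweak by x in GF(2^128), reducing with the constant 0x87. */
static void tweak_mul_x(blk128 *t)
{
	uint64_t carry = t->hi >> 63;

	t->hi = (t->hi << 1) | (t->lo >> 63);
	t->lo = (t->lo << 1) ^ (carry ? 0x87 : 0);
}

/* One XOR pass; t is passed by value, so each pass restarts from the first tweak. */
static void xor_tweak_pass(blk128 *dst, const blk128 *src, size_t n, blk128 t)
{
	for (size_t i = 0; i < n; i++) {
		dst[i].lo = src[i].lo ^ t.lo;
		dst[i].hi = src[i].hi ^ t.hi;
		tweak_mul_x(&t);
	}
}

/* Toy stand-in for the child ecb(...) call: invertible, but NOT a real cipher. */
static void toy_ecb(blk128 *buf, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t lo = buf[i].lo;

		buf[i].lo = buf[i].hi + 0x9e3779b97f4a7c15ULL;
		buf[i].hi = lo ^ 0x0123456789abcdefULL;
	}
}

/* Returns 1 if recomputing the tweaks gives the same output as caching them. */
int xts_recompute_matches_cached(void)
{
	enum { N = 8 };
	blk128 t0 = { 0x0123456789abcdefULL, 0x8000000000000001ULL };
	blk128 src[N], a[N], b[N], cache[N], t = t0;

	for (size_t i = 0; i < N; i++) {
		src[i].lo = i * 0x1111111111111111ULL;
		src[i].hi = ~src[i].lo;
	}

	/* Recompute variant: both passes regenerate the tweaks from t0. */
	xor_tweak_pass(a, src, N, t0);
	toy_ecb(a, N);
	xor_tweak_pass(a, a, N, t0);

	/* Cached variant: tweaks are generated once and stored in a buffer. */
	for (size_t i = 0; i < N; i++) {
		cache[i] = t;
		b[i].lo = src[i].lo ^ t.lo;
		b[i].hi = src[i].hi ^ t.hi;
		tweak_mul_x(&t);
	}
	toy_ecb(b, N);
	for (size_t i = 0; i < N; i++) {
		b[i].lo ^= cache[i].lo;
		b[i].hi ^= cache[i].hi;
	}

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Passing t by value plays the same role as the local copy "le128 t = rctx->t;"
in xor_tweak(): the stored first tweak is never modified, so the second pass
can start over from it without any auxiliary buffer. The recompute variant
costs one extra shift-and-XOR per block, while the cached variant needs a
buffer as large as the data it covers.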