From patchwork Mon Nov 14 21:01:13 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Cope <alexcope@google.com>
X-Patchwork-Id: 9428395
X-Patchwork-Delegate: herbert@gondor.apana.org.au
From: Alex Cope <alexcope@google.com>
To: linux-crypto@vger.kernel.org
Cc: mhalcrow@google.com, edknapp@google.com, Alex Cope <alexcope@google.com>,
	Eric Biggers <ebiggers@google.com>
Subject: [RFC][PATCH 2/7] crypto: gf128mul - Refactor gf128 overflow macros
Date: Mon, 14 Nov 2016 13:01:13 -0800
Message-Id: <1479157277-10251-3-git-send-email-alexcope@google.com>
X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020
In-Reply-To: <1479157277-10251-1-git-send-email-alexcope@google.com>
References: <1479157277-10251-1-git-send-email-alexcope@google.com>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-crypto.vger.kernel.org>
X-Mailing-List: linux-crypto@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

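Rename the overflow macros and their tables from xda_bbe/xda_lle and
gf128mul_table_bbe/gf128mul_table_lle to xda_be/xda_le and
gf128mul_table_be/gf128mul_table_le, and replace the xx() token-pasting
helper with plain hex constants. The tables are more general than the
old names suggested: the "be" table also serves "ble" elements, and the
"le" table also serves "lbe" elements. No functional change.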

Signed-off-by: Alex Cope <alexcope@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
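Note (illustration only, not part of the patch): the eight constants in
each macro appear to be shifted copies of the low terms of the reduction
polynomial, x^7 + x^2 + x + 1. The sketch below recomputes them under
that assumption. The names GF128_POLY_BE, GF128_POLY_LE, xda_be_calc and
xda_le_calc are hypothetical and do not exist in crypto/gf128mul.c.

```c
#include <stdint.h>

/* x^7 + x^2 + x + 1 with bit 0 holding the coefficient of x^0 ("be"). */
#define GF128_POLY_BE 0x0087u
/* The same polynomial with bit 15 holding the coefficient of x^0 ("le"). */
#define GF128_POLY_LE 0xe100u

/* "be": overflow bit k stands for x^(128+k), which reduces to poly * x^k. */
static inline uint16_t xda_be_calc(uint8_t i)
{
	uint16_t v = 0;
	int k;

	for (k = 0; k < 8; k++)
		if (i & (1u << k))
			v ^= (uint16_t)(GF128_POLY_BE << k);
	return v;
}

/* "le": bit order is mirrored, so the mask walks down from 0x80 and the
 * polynomial is shifted right instead of left. */
static inline uint16_t xda_le_calc(uint8_t i)
{
	uint16_t v = 0;
	int k;

	for (k = 0; k < 8; k++)
		if (i & (0x80u >> k))
			v ^= (uint16_t)(GF128_POLY_LE >> k);
	return v;
}
```

If the assumption holds, xda_be_calc(i) and xda_le_calc(i) should agree
with xda_be(i) and xda_le(i) for every i in 0..255.
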
 crypto/gf128mul.c | 68 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 27 deletions(-)
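
Note (illustration only, not part of the patch): a rough sketch of why
one "be" table can serve both "bbe" and "ble" elements. Once the two
64-bit halves are loaded into CPU byte order, the multiply-by-x step is
the same; only the load differs. The struct gf128_host type and every
helper name below are hypothetical, and the constant 0x87 stands in for
the table lookup at index 1.

```c
#include <stdint.h>

/* Hypothetical host-order view: hi holds x^127..x^64, lo holds x^63..x^0. */
struct gf128_host {
	uint64_t hi, lo;
};

static inline uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;
	int k;

	for (k = 0; k < 8; k++)
		v = (v << 8) | p[k];
	return v;
}

static inline uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int k;

	for (k = 7; k >= 0; k--)
		v = (v << 8) | p[k];
	return v;
}

/* "bbe": big endian bytes, high half stored first. */
static inline struct gf128_host load_bbe(const uint8_t b[16])
{
	struct gf128_host v = { load_be64(b), load_be64(b + 8) };
	return v;
}

/* "ble": little endian bytes, low half stored first. */
static inline struct gf128_host load_ble(const uint8_t b[16])
{
	struct gf128_host v = { load_le64(b + 8), load_le64(b) };
	return v;
}

/* Multiply by x in the "be" bit convention; shared by both layouts. */
static inline struct gf128_host mul_x_be(struct gf128_host v)
{
	struct gf128_host r;
	uint64_t overflow = v.hi >> 63;	/* coefficient of x^127 */

	r.hi = (v.hi << 1) | (v.lo >> 63);
	r.lo = (v.lo << 1) ^ (overflow ? 0x87 : 0);
	return r;
}
```

Loading the element x^127 from either layout should yield the same
host-order pair, and mul_x_be() should then reduce it to
x^7 + x^2 + x + 1.
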

diff --git a/crypto/gf128mul.c b/crypto/gf128mul.c
index 0594dd6..8b65b1e 100644
--- a/crypto/gf128mul.c
+++ b/crypto/gf128mul.c
@@ -88,33 +88,47 @@
 	q(0xf8), q(0xf9), q(0xfa), q(0xfb), q(0xfc), q(0xfd), q(0xfe), q(0xff) \
 }
 
-/*	Given the value i in 0..255 as the byte overflow when a field element
-    in GHASH is multiplied by x^8, this function will return the values that
-    are generated in the lo 16-bit word of the field value by applying the
-    modular polynomial. The values lo_byte and hi_byte are returned via the
-    macro xp_fun(lo_byte, hi_byte) so that the values can be assembled into
-    memory as required by a suitable definition of this macro operating on
-    the table above
-*/
-
-#define xx(p, q)	0x##p##q
+/*
+ * Given a value i in 0..255 as the byte overflow when a field element
+ * in GF(2^128) is multiplied by x^8, the following macro returns the
+ * 16-bit value that must be XOR-ed into the low-degree end of the
+ * product to reduce it modulo the irreducible polynomial x^128 + x^7 +
+ * x^2 + x + 1.
+ *
+ * There are two versions of the macro, and hence two tables: one for
+ * the "be" convention where the highest-order bit is the coefficient of
+ * the highest-degree polynomial term, and one for the "le" convention
+ * where the highest-order bit is the coefficient of the lowest-degree
+ * polynomial term.  In both cases the values are stored in CPU byte
+ * endianness such that the coefficients are ordered consistently across
+ * bytes, i.e. in the "be" table bits 15..0 of the stored value
+ * correspond to the coefficients of x^15..x^0, and in the "le" table
+ * bits 15..0 correspond to the coefficients of x^0..x^15.
+ *
+ * Therefore, provided that the appropriate byte endianness conversions
+ * are done by the multiplication functions (and these must be in place
+ * anyway to support both little endian and big endian CPUs), the "be"
+ * table can be used for multiplications of both "bbe" and "ble"
+ * elements, and the "le" table can be used for multiplications of both
+ * "lle" and "lbe" elements.
+ */
 
-#define xda_bbe(i) ( \
-	(i & 0x80 ? xx(43, 80) : 0) ^ (i & 0x40 ? xx(21, c0) : 0) ^ \
-	(i & 0x20 ? xx(10, e0) : 0) ^ (i & 0x10 ? xx(08, 70) : 0) ^ \
-	(i & 0x08 ? xx(04, 38) : 0) ^ (i & 0x04 ? xx(02, 1c) : 0) ^ \
-	(i & 0x02 ? xx(01, 0e) : 0) ^ (i & 0x01 ? xx(00, 87) : 0) \
+#define xda_be(i) ( \
+	(i & 0x80 ? 0x4380 : 0) ^ (i & 0x40 ? 0x21c0 : 0) ^ \
+	(i & 0x20 ? 0x10e0 : 0) ^ (i & 0x10 ? 0x0870 : 0) ^ \
+	(i & 0x08 ? 0x0438 : 0) ^ (i & 0x04 ? 0x021c : 0) ^ \
+	(i & 0x02 ? 0x010e : 0) ^ (i & 0x01 ? 0x0087 : 0) \
 )
 
-#define xda_lle(i) ( \
-	(i & 0x80 ? xx(e1, 00) : 0) ^ (i & 0x40 ? xx(70, 80) : 0) ^ \
-	(i & 0x20 ? xx(38, 40) : 0) ^ (i & 0x10 ? xx(1c, 20) : 0) ^ \
-	(i & 0x08 ? xx(0e, 10) : 0) ^ (i & 0x04 ? xx(07, 08) : 0) ^ \
-	(i & 0x02 ? xx(03, 84) : 0) ^ (i & 0x01 ? xx(01, c2) : 0) \
+#define xda_le(i) ( \
+	(i & 0x80 ? 0xe100 : 0) ^ (i & 0x40 ? 0x7080 : 0) ^ \
+	(i & 0x20 ? 0x3840 : 0) ^ (i & 0x10 ? 0x1c20 : 0) ^ \
+	(i & 0x08 ? 0x0e10 : 0) ^ (i & 0x04 ? 0x0708 : 0) ^ \
+	(i & 0x02 ? 0x0384 : 0) ^ (i & 0x01 ? 0x01c2 : 0) \
 )
 
-static const u16 gf128mul_table_lle[256] = gf128mul_dat(xda_lle);
-static const u16 gf128mul_table_bbe[256] = gf128mul_dat(xda_bbe);
+static const u16 gf128mul_table_le[256] = gf128mul_dat(xda_le);
+static const u16 gf128mul_table_be[256] = gf128mul_dat(xda_be);
 
 /* These functions multiply a field element by x, by x^4 and by x^8
  * in the polynomial field representation. It uses 32-bit word operations
@@ -126,7 +140,7 @@ static void gf128mul_x_lle(be128 *r, const be128 *x)
 {
 	u64 a = be64_to_cpu(x->a);
 	u64 b = be64_to_cpu(x->b);
-	u64 _tt = gf128mul_table_lle[(b << 7) & 0xff];
+	u64 _tt = gf128mul_table_le[(b << 7) & 0xff];
 
 	r->b = cpu_to_be64((b >> 1) | (a << 63));
 	r->a = cpu_to_be64((a >> 1) ^ (_tt << 48));
@@ -136,7 +150,7 @@ static void gf128mul_x_bbe(be128 *r, const be128 *x)
 {
 	u64 a = be64_to_cpu(x->a);
 	u64 b = be64_to_cpu(x->b);
-	u64 _tt = gf128mul_table_bbe[a >> 63];
+	u64 _tt = gf128mul_table_be[a >> 63];
 
 	r->a = cpu_to_be64((a << 1) | (b >> 63));
 	r->b = cpu_to_be64((b << 1) ^ _tt);
@@ -146,7 +160,7 @@ void gf128mul_x_ble(be128 *r, const be128 *x)
 {
 	u64 a = le64_to_cpu(x->a);
 	u64 b = le64_to_cpu(x->b);
-	u64 _tt = gf128mul_table_bbe[b >> 63];
+	u64 _tt = gf128mul_table_be[b >> 63];
 
 	r->a = cpu_to_le64((a << 1) ^ _tt);
 	r->b = cpu_to_le64((b << 1) | (a >> 63));
@@ -157,7 +171,7 @@ static void gf128mul_x8_lle(be128 *x)
 {
 	u64 a = be64_to_cpu(x->a);
 	u64 b = be64_to_cpu(x->b);
-	u64 _tt = gf128mul_table_lle[b & 0xff];
+	u64 _tt = gf128mul_table_le[b & 0xff];
 
 	x->b = cpu_to_be64((b >> 8) | (a << 56));
 	x->a = cpu_to_be64((a >> 8) ^ (_tt << 48));
@@ -167,7 +181,7 @@ static void gf128mul_x8_bbe(be128 *x)
 {
 	u64 a = be64_to_cpu(x->a);
 	u64 b = be64_to_cpu(x->b);
-	u64 _tt = gf128mul_table_bbe[a >> 56];
+	u64 _tt = gf128mul_table_be[a >> 56];
 
 	x->a = cpu_to_be64((a << 8) | (b >> 56));
 	x->b = cpu_to_be64((b << 8) ^ _tt);