diff mbox series

[v2,1/6] crypto: Use memzero_explicit() for clearing state

Message ID 20201020203957.3512851-2-nivedita@alum.mit.edu (mailing list archive)
State Superseded
Delegated to: Herbert Xu
Headers show
Series crypto: lib/sha256 - cleanup/optimization | expand

Commit Message

Arvind Sankar Oct. 20, 2020, 8:39 p.m. UTC
Without the barrier_data() inside memzero_explicit(), the compiler may
optimize away the state-clearing if it can tell that the state is not
used afterwards. At least in lib/crypto/sha256.c:__sha256_final(), the
function can get inlined into sha256(), in which case the memset is
optimized away.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
---
 include/crypto/sha1_base.h   | 3 ++-
 include/crypto/sha256_base.h | 3 ++-
 include/crypto/sha512_base.h | 3 ++-
 include/crypto/sm3_base.h    | 3 ++-
 lib/crypto/sha256.c          | 2 +-
 5 files changed, 9 insertions(+), 5 deletions(-)

Comments

Eric Biggers Oct. 22, 2020, 4:36 a.m. UTC | #1
On Tue, Oct 20, 2020 at 04:39:52PM -0400, Arvind Sankar wrote:
> Without the barrier_data() inside memzero_explicit(), the compiler may
> optimize away the state-clearing if it can tell that the state is not
> used afterwards. At least in lib/crypto/sha256.c:__sha256_final(), the
> function can get inlined into sha256(), in which case the memset is
> optimized away.
> 
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>

Reviewed-by: Eric Biggers <ebiggers@google.com>

Maybe get the one in arch/arm64/crypto/sha3-ce-glue.c too?

- Eric
Arvind Sankar Oct. 23, 2020, 3:39 p.m. UTC | #2
On Wed, Oct 21, 2020 at 09:36:33PM -0700, Eric Biggers wrote:
> On Tue, Oct 20, 2020 at 04:39:52PM -0400, Arvind Sankar wrote:
> > Without the barrier_data() inside memzero_explicit(), the compiler may
> > optimize away the state-clearing if it can tell that the state is not
> > used afterwards. At least in lib/crypto/sha256.c:__sha256_final(), the
> > function can get inlined into sha256(), in which case the memset is
> > optimized away.
> > 
> > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> 
> Reviewed-by: Eric Biggers <ebiggers@google.com>
> 
> Maybe get the one in arch/arm64/crypto/sha3-ce-glue.c too?
> 
> - Eric

Hm, there are a few more as well like that. But now I'm thinking it's
only the generic sha256.c that may be problematic. The rest of them are
in _final() functions which will be stored as function pointers in a
structure, so there should be no risk of them getting optimized away?
Eric Biggers Oct. 23, 2020, 3:56 p.m. UTC | #3
On Fri, Oct 23, 2020 at 11:39:27AM -0400, Arvind Sankar wrote:
> On Wed, Oct 21, 2020 at 09:36:33PM -0700, Eric Biggers wrote:
> > On Tue, Oct 20, 2020 at 04:39:52PM -0400, Arvind Sankar wrote:
> > > Without the barrier_data() inside memzero_explicit(), the compiler may
> > > optimize away the state-clearing if it can tell that the state is not
> > > used afterwards. At least in lib/crypto/sha256.c:__sha256_final(), the
> > > function can get inlined into sha256(), in which case the memset is
> > > optimized away.
> > > 
> > > Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> > 
> > Reviewed-by: Eric Biggers <ebiggers@google.com>
> > 
> > Maybe get the one in arch/arm64/crypto/sha3-ce-glue.c too?
> > 
> > - Eric
> 
> Hm, there are a few more as well like that. But now I'm thinking it's
> only the generic sha256.c that may be problematic. The rest of them are
> in _final() functions which will be stored as function pointers in a
> structure, so there should be no risk of them getting optimized away?

When clearing memory because "it may be sensitive" rather than "it's needed for
the code to behave correctly", I think it's best to use memzero_explicit() to
make the intent clear, even if it seems that memset() is sufficient.  Also keep
in mind that support for compiling the kernel with LTO (link-time optimization)
is being worked on (and some people already do it), which results in more code
being optimized out.

- Eric
Herbert Xu Oct. 23, 2020, 8:45 p.m. UTC | #4
On Fri, Oct 23, 2020 at 08:56:04AM -0700, Eric Biggers wrote:
>
> When clearing memory because "it may be sensitive" rather than "it's needed for
> the code to behave correctly", I think it's best to use memzero_explicit() to
> make the intent clear, even if it seems that memset() is sufficient.  Also keep
> in mind that support for compiling the kernel with LTO (link-time optimization)
> is being worked on (and some people already do it), which results in more code
> being optimized out.

The rule up until now has been that we only use memzero_explicit for
stack variables.  At this point please don't convert anything else
as it will cause too much churn.

If LTO did arrive we should do a global conversion.

Cheers,
Eric Biggers Oct. 23, 2020, 9:53 p.m. UTC | #5
On Sat, Oct 24, 2020 at 07:45:36AM +1100, Herbert Xu wrote:
> On Fri, Oct 23, 2020 at 08:56:04AM -0700, Eric Biggers wrote:
> >
> > When clearing memory because "it may be sensitive" rather than "it's needed for
> > the code to behave correctly", I think it's best to use memzero_explicit() to
> > make the intent clear, even if it seems that memset() is sufficient.  Also keep
> > in mind that support for compiling the kernel with LTO (link-time optimization)
> > is being worked on (and some people already do it), which results in more code
> > being optimized out.
> 
> The rule up until now has been that we only use memzero_explicit for
> stack variables.  At this point please don't convert anything else
> as it will cause too much churn.
> 
> If LTO did arrive we should do a global conversion.
> 

LTO is actively being worked on, now up to v6:
https://lkml.kernel.org/lkml/20201013003203.4168817-1-samitolvanen@google.com/
And in the real world it's already being used; the Android Compatibility
Definition Document strongly recommends enabling CFI, which depends on LTO.

It's doubtful that anyone will do a global conversion from memset() to
memzero_explicit(), as it's too hard to find all the places that should be
converted.  They are in lots of different subsystems; the crypto subsystem will
have the most, but not all.  We just need to fix as many as we can.  If you'd
like to do something more comprehensive than this patch, that would be great,
but I hope we don't wait forever for a global conversion that never happens.

FWIW, kfree_sensitive() (formerly kzfree) already got converted by
https://git.kernel.org/linus/8982ae527fbef170, and it wasn't really
controversial.  Some people even wanted Cc stable (which I disagreed with, but
it apparently made the final version).

- Eric
Herbert Xu Oct. 29, 2020, 7 a.m. UTC | #6
On Fri, Oct 23, 2020 at 02:53:29PM -0700, Eric Biggers wrote:
>
> It's doubtful that anyone will do a global conversion from memset() to
> memzero_explicit(), as it's too hard to find all the places that should be
> converted.  They are in lots of different subsystems; the crypto subsystem will
> have the most, but not all.  We just need to fix as many as we can.  If you'd
> like to do something more comprehensive than this patch, that would be great,
> but I hope we don't wait forever for a global conversion that never happens.

I had thought that there were more instances of memset in crypto/
that needed converting, but a quick grep seems to indicate that most
are not sensitive.  So I'm OK with this patch.

Thanks,
diff mbox series

Patch

diff --git a/include/crypto/sha1_base.h b/include/crypto/sha1_base.h
index 20fd1f7468af..a5d6033efef7 100644
--- a/include/crypto/sha1_base.h
+++ b/include/crypto/sha1_base.h
@@ -12,6 +12,7 @@ 
 #include <crypto/sha.h>
 #include <linux/crypto.h>
 #include <linux/module.h>
+#include <linux/string.h>
 
 #include <asm/unaligned.h>
 
@@ -101,7 +102,7 @@  static inline int sha1_base_finish(struct shash_desc *desc, u8 *out)
 	for (i = 0; i < SHA1_DIGEST_SIZE / sizeof(__be32); i++)
 		put_unaligned_be32(sctx->state[i], digest++);
 
-	*sctx = (struct sha1_state){};
+	memzero_explicit(sctx, sizeof(*sctx));
 	return 0;
 }
 
diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
index 6ded110783ae..93f9fd21cc06 100644
--- a/include/crypto/sha256_base.h
+++ b/include/crypto/sha256_base.h
@@ -12,6 +12,7 @@ 
 #include <crypto/sha.h>
 #include <linux/crypto.h>
 #include <linux/module.h>
+#include <linux/string.h>
 
 #include <asm/unaligned.h>
 
@@ -105,7 +106,7 @@  static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
 	for (i = 0; digest_size > 0; i++, digest_size -= sizeof(__be32))
 		put_unaligned_be32(sctx->state[i], digest++);
 
-	*sctx = (struct sha256_state){};
+	memzero_explicit(sctx, sizeof(*sctx));
 	return 0;
 }
 
diff --git a/include/crypto/sha512_base.h b/include/crypto/sha512_base.h
index fb19c77494dc..93ab73baa38e 100644
--- a/include/crypto/sha512_base.h
+++ b/include/crypto/sha512_base.h
@@ -12,6 +12,7 @@ 
 #include <crypto/sha.h>
 #include <linux/crypto.h>
 #include <linux/module.h>
+#include <linux/string.h>
 
 #include <asm/unaligned.h>
 
@@ -126,7 +127,7 @@  static inline int sha512_base_finish(struct shash_desc *desc, u8 *out)
 	for (i = 0; digest_size > 0; i++, digest_size -= sizeof(__be64))
 		put_unaligned_be64(sctx->state[i], digest++);
 
-	*sctx = (struct sha512_state){};
+	memzero_explicit(sctx, sizeof(*sctx));
 	return 0;
 }
 
diff --git a/include/crypto/sm3_base.h b/include/crypto/sm3_base.h
index 1cbf9aa1fe52..2f3a32ab97bb 100644
--- a/include/crypto/sm3_base.h
+++ b/include/crypto/sm3_base.h
@@ -13,6 +13,7 @@ 
 #include <crypto/sm3.h>
 #include <linux/crypto.h>
 #include <linux/module.h>
+#include <linux/string.h>
 #include <asm/unaligned.h>
 
 typedef void (sm3_block_fn)(struct sm3_state *sst, u8 const *src, int blocks);
@@ -104,7 +105,7 @@  static inline int sm3_base_finish(struct shash_desc *desc, u8 *out)
 	for (i = 0; i < SM3_DIGEST_SIZE / sizeof(__be32); i++)
 		put_unaligned_be32(sctx->state[i], digest++);
 
-	*sctx = (struct sm3_state){};
+	memzero_explicit(sctx, sizeof(*sctx));
 	return 0;
 }
 
diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
index 2321f6cb322f..d43bc39ab05e 100644
--- a/lib/crypto/sha256.c
+++ b/lib/crypto/sha256.c
@@ -265,7 +265,7 @@  static void __sha256_final(struct sha256_state *sctx, u8 *out, int digest_words)
 		put_unaligned_be32(sctx->state[i], &dst[i]);
 
 	/* Zeroize sensitive information. */
-	memset(sctx, 0, sizeof(*sctx));
+	memzero_explicit(sctx, sizeof(*sctx));
 }
 
 void sha256_final(struct sha256_state *sctx, u8 *out)