fscrypt: lock mutex before checking for bounce page pool

Message ID 20170706175748.33093-1-ebiggers3@gmail.com
State Accepted

Commit Message

Eric Biggers July 6, 2017, 5:57 p.m. UTC
From: Eric Biggers <ebiggers@google.com>

fscrypt_initialize(), which allocates the global bounce page pool when
an encrypted file is first accessed, uses "double-checked locking" to
try to avoid locking fscrypt_init_mutex.  However, it doesn't use any
memory barriers, so it's theoretically possible for a thread to observe
a bounce page pool which has not been fully initialized.  This is a
classic bug with "double-checked locking".

While "only a theoretical issue" in the latest kernel, in pre-4.8
kernels the pointer that was checked was not even the last to be
initialized, so it was easily possible for a crash (NULL pointer
dereference) to happen.  This was changed only incidentally by the large
refactor to use fs/crypto/.

Solve both problems in a trivial way that can easily be backported: just
always take the mutex.  It's theoretically less efficient, but it
shouldn't be noticeable in practice as the mutex is only acquired very
briefly once per encrypted file.
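
As a rough illustration of the "always take the mutex" approach, here is a
minimal userspace sketch using pthreads rather than kernel primitives.  The
names pool_initialize, init_mutex, bounce_pool and init_count are hypothetical
stand-ins, not the identifiers used in fs/crypto/crypto.c, and the 4096-byte
allocation merely stands in for the real pool setup.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for fscrypt_init_mutex and the bounce page pool. */
static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
static void *bounce_pool;	/* only read or written with init_mutex held */
static int init_count;		/* how many times the pool was really allocated */

/* Returns 0 on success, -1 if the allocation fails. */
int pool_initialize(void)
{
	int res = 0;

	/*
	 * No unlocked "is it already set?" check: the mutex is always taken,
	 * so its lock/unlock ordering makes the pool's contents visible to
	 * every later caller without any explicit memory barriers.
	 */
	pthread_mutex_lock(&init_mutex);
	if (!bounce_pool) {
		bounce_pool = malloc(4096);
		if (bounce_pool)
			init_count++;
		else
			res = -1;
	}
	/* Invariant: a successful return implies the pool exists. */
	assert(res != 0 || bounce_pool != NULL);
	pthread_mutex_unlock(&init_mutex);
	return res;
}
```

Repeated calls are cheap and idempotent: every call after the first one only
locks, sees a non-NULL pool, and unlocks.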

Later I'd like to make this use a helper macro like DO_ONCE().  However,
DO_ONCE() runs in atomic context, so we'd need to add a new macro that
allows blocking.

Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/crypto/crypto.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Eric Biggers Sept. 13, 2017, 5:22 a.m. UTC | #1
Ted, can you take this patch?  On Android this bug has been causing a NULL
pointer dereference in ext4_get_encryption_info on boot.  Granted, due to the
way the code has been moved around it no longer would happen in practice in the
latest kernel, but we still need something to backport to 4.4, etc.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Biggers Oct. 9, 2017, 8:53 p.m. UTC | #2
Ping.  Ted, can you take this through the fscrypt tree?  Or should I send a
similar patch just for 4.4-stable (and earlier), then do something fancier with
smp_store_release, smp_load_acquire, etc. for the latest version?  Personally
I'd prefer starting with the trivial fix, as it can be optimized later.
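
For reference, the "fancier" variant could look roughly like the userspace
sketch below.  It uses C11 atomics with acquire/release ordering as an analog
of smp_load_acquire()/smp_store_release(); it is not kernel code, and
struct pool, pool_get, pool_lock, global_pool and POOL_PAGES are all
hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define POOL_PAGES 32	/* hypothetical pool size */

struct pool {
	int nr_pages;
};

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct pool *) global_pool;

/* Returns the shared pool, creating it on first use; NULL on failure. */
struct pool *pool_get(void)
{
	/* Fast path: this acquire load pairs with the release store below. */
	struct pool *p = atomic_load_explicit(&global_pool,
					      memory_order_acquire);
	if (p) {
		/* Seeing the pointer implies seeing the initialized fields. */
		assert(p->nr_pages == POOL_PAGES);
		return p;
	}

	pthread_mutex_lock(&pool_lock);
	/* Second check under the lock; the mutex already orders this load. */
	p = atomic_load_explicit(&global_pool, memory_order_relaxed);
	if (!p) {
		p = malloc(sizeof(*p));
		if (p) {
			p->nr_pages = POOL_PAGES;
			/* Publish only after the pool is fully initialized. */
			atomic_store_explicit(&global_pool, p,
					      memory_order_release);
		}
	}
	pthread_mutex_unlock(&pool_lock);
	return p;
}
```

The extra ordering annotations are what the original unlocked check lacked,
and they are also why this version is harder to review and backport than
simply taking the mutex every time.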

Eric
Theodore Y. Ts'o Oct. 29, 2017, 10:25 a.m. UTC | #3
Applied, thanks.  Sorry for the delay; this slipped through the
cracks, and then I had a crazy travel/conference schedule.

		    	    	       	 - Ted

Patch

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index c7835df7e7b8..d262a93d9b31 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -410,11 +410,8 @@  int fscrypt_initialize(unsigned int cop_flags)
 {
 	int i, res = -ENOMEM;
 
-	/*
-	 * No need to allocate a bounce page pool if there already is one or
-	 * this FS won't use it.
-	 */
-	if (cop_flags & FS_CFLG_OWN_PAGES || fscrypt_bounce_page_pool)
+	/* No need to allocate a bounce page pool if this FS won't use it. */
+	if (cop_flags & FS_CFLG_OWN_PAGES)
 		return 0;
 
 	mutex_lock(&fscrypt_init_mutex);