fscrypt: use 32 bytes of encrypted filename

Message ID	20170418210642.6039-1-gwendal@chromium.org (mailing list archive)
State	Superseded
Headers	show Return-Path: <linux-fscrypt-owner@kernel.org> From: Gwendal Grignou <gwendal@chromium.org> To: tytso@mit.edu, ebiggers@google.com Cc: linux-ext4@vger.kernel.org, linux-fscrypt@vger.kernel.org, kinaba@chromium.org, hashimoto@chromium.org Subject: [PATCH] fscrypt: use 32 bytes of encrypted filename Date: Tue, 18 Apr 2017 14:06:42 -0700 Message-Id: <20170418210642.6039-1-gwendal@chromium.org> Sender: linux-fscrypt-owner@vger.kernel.org Precedence: bulk

Gwendal Grignou April 18, 2017, 9:06 p.m. UTC

If we use only 16 bytes, due to how CBC works, if the names have the
same beginning, their last ciphertext block (16 bytes) may be identical.

It happens when two file names share the first 16k bytes and both have
length withn 16 * n + 13 and 16 * n + 16.

Instead use 32 bytes to build the filenames from encrypted data when
directory is scrambled.

The drawback is the scrambled filenames change after applying the patch.
Consider an encrypted directory with:

ls -il
total 8
1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
system@framework@boot-telephony-common.art.crc
1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
system@framework@boot-telephony-common.oat.crc

Once the key is invalidated, without the patch, ls -li produces:
1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
_a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3
1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
_wJS,0akq14ehC8s9pUywlAyFzlz7n6C3

Both files show with the same inode.

After the patch, the names are different, but the inode information is
now correct:
ls -li
1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
_a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD
1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
_wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD

Tested only on ext4.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
---
Script to reproduce the error:

BASE_DIR="~/"
DIR="${BASE_DIR}/tmp"
# Create directory.
mkdir -p "${DIR}"
echo foobar | e4crypt add_key "${DIR}"
# Fill directory.
cd "${DIR}"
touch system@framework@boot-telephony-common.oat.crc
touch system@framework@boot-telephony-common.art.crc
cd ..
# Check files have different inode.
ls -il "${DIR}"
# Invalidate key
KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')"
sync
keyctl invalidate "${KEY}"
echo 3 > /proc/sys/vm/drop_caches
# Once the key is invalidated, both files have the same inode:
ls -il "${DIR}"
if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then
  echo same inode!
fi
# if we try to remove the directory, we will get an error
# : Structure needs cleaning
# rm -rf "${DIR}"

 fs/crypto/fname.c | 20 +++++++++++++-------
 fs/ext4/namei.c   |  4 ++--
 2 files changed, 15 insertions(+), 9 deletions(-)

Eric Biggers April 18, 2017, 11:01 p.m. UTC | #1

+Cc linux-f2fs-devel@lists.sourceforge.net
+Cc linux-mtd@lists.infradead.org (for ubifs)

Hi Gwendal,

On Tue, Apr 18, 2017 at 02:06:42PM -0700, Gwendal Grignou wrote:
> If we use only 16 bytes, due to how CBC works, if the names have the
> same beginning, their last ciphertext block (16 bytes) may be identical.
> 
> It happens when two file names share the first 16k bytes and both have
> length withn 16 * n + 13 and 16 * n + 16.
> 
> Instead use 32 bytes to build the filenames from encrypted data when
> directory is scrambled.

Just some background for people who may be unfamiliar with what's going on here
(and it may be useful to include some of this in the patch description):

When accessing files without access to the key, userspace needs to operate on a
filename derived from the ciphertext filename, which contains arbitrary bytes. 

But since it's supported to use filenames up to FILENAME_MAX (255 bytes) in
length when using encryption, we can't always base-64 encode the filename, since
that may make it too long.

The way this is solved currently is that for filenames with ciphertext length
greater than 32 bytes, the filesystem provides an 8-byte "cookie" (split into
'hash' and 'minor_hash'), which along with the last 16 bytes of the filename
ciphertext is base-64 encoded into a fixed-length name.  The filesystem returns
this on readdir.  Then, when a lookup is done, the filesystem translates this
info back into a specific directory entry.

Since ext4 directory entries do not contain a hash field, ext4 relies only on
the 16 bytes of ciphertext to distinguish collisions within a directory block.
Unfortunately, this is broken because with the encryption mode used for
filenames (CTS), the ciphertext of the last 16-byte block depends only on the
plaintext up to and including the *second to last* block, not up to the last
block.  This causes long filenames that differ just near the end to collide.

We could fix this by using the second to last block of ciphertext rather than
the last one.  However, using the last *two* blocks as you're proposing should
be fine too.

Of course we could also hash the filename's ciphertext with SHA-256 or
something, but it's nice to take advantage of the encryption mode, and not have
to do yet another hash.

I am not too worried about changing the way encrypted filenames are presented,
since applications are not supposed to rely on this.  (Though we probably should
be doing something to catch broken applications, like encoding the filenames
slightly differently after each reboot...)

Strangely, f2fs and ubifs do not use the bytes from the filename at all when
trying to find a specific directory entry in this case.  So this patch doesn't
really affect them.  This seems unreliable; perhaps we should introduce a
function like "fscrypt_name_matches()" which all the filesystems could call?
Can any of the f2fs and ubifs developers explain why they don't look at any
bytes from the filename?

Anyway, a couple nits on this patch:

> +	oname->len = 1 + digest_encode(
> +			buf,
> +			FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64),
> +			oname->name + 1);
>  	return 0;
>  }

Use 'sizeof(buf)'

>  
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index c4a389a6027b..14b2a2335a32 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname,
>  			int ret;

>  			if (de->name_len < 16)
>  				return 0;

de->name_len < 32

(or replace 32 with FS_FNAME_CRYPTO_DIGEST_SIZE, here and below)

- Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andreas Dilger April 18, 2017, 11:37 p.m. UTC | #2

On Apr 18, 2017, at 3:06 PM, Gwendal Grignou <gwendal@chromium.org> wrote:
> Subject: [PATCH] fscrypt: use 32 bytes of encrypted filename

> If we use only 16 bytes, due to how CBC works, if the names have the
> same beginning, their last ciphertext block (16 bytes) may be identical.

What is missing from the patch summary is: use the encrypted filename for
_what_?

> 
> It happens when two file names share the first 16k bytes and both have

"16k bytes" for the file names?  Presumably you don't mean "16KB", but
rather "16n" or "multiple of 16 bytes" would be more clear.

> length withn 16 * n + 13 and 16 * n + 16.
> 
> Instead use 32 bytes to build the filenames from encrypted data when
> directory is scrambled.
> 
> The drawback is the scrambled filenames change after applying the patch.
> Consider an encrypted directory with:
> 
> ls -il
> total 8
> 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
> system@framework@boot-telephony-common.art.crc
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> system@framework@boot-telephony-common.oat.crc
> 
> Once the key is invalidated, without the patch, ls -li produces:
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _wJS,0akq14ehC8s9pUywlAyFzlz7n6C3
> 
> Both files show with the same inode.
> 
> After the patch, the names are different, but the inode information is
> now correct:
> ls -li
> 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD

It isn't clear to me what the difference is here.  In the first case
(without patch) the last 21 characters of the filename are the same,
but the first 11 characters are still different, so it isn't clear why
that isn't enough to distinguish the files, and why checking the whole
filename isn't enough to properly determine the inode number?

In the second case (with patch) the last 22 characters are also the same.

> Tested only on ext4.
> 
> Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
> ---
> Script to reproduce the error:
> 
> BASE_DIR="~/"
> DIR="${BASE_DIR}/tmp"
> # Create directory.
> mkdir -p "${DIR}"
> echo foobar | e4crypt add_key "${DIR}"
> # Fill directory.
> cd "${DIR}"
> touch system@framework@boot-telephony-common.oat.crc
> touch system@framework@boot-telephony-common.art.crc
> cd ..
> # Check files have different inode.
> ls -il "${DIR}"
> # Invalidate key
> KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')"
> sync
> keyctl invalidate "${KEY}"
> echo 3 > /proc/sys/vm/drop_caches
> # Once the key is invalidated, both files have the same inode:
> ls -il "${DIR}"
> if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then
>  echo same inode!
> fi
> # if we try to remove the directory, we will get an error
> # : Structure needs cleaning
> # rm -rf "${DIR}"
> 
> fs/crypto/fname.c | 20 +++++++++++++-------
> fs/ext4/namei.c   |  4 ++--
> 2 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
> index 80bb956e14e5..71ddc3eaa62d 100644
> --- a/fs/crypto/fname.c
> +++ b/fs/crypto/fname.c
> @@ -274,7 +274,7 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
> 			struct fscrypt_str *oname)
> {
> 	const struct qstr qname = FSTR_TO_QSTR(iname);
> -	char buf[24];
> +	char buf[FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64)];
> 
> 	if (fscrypt_is_dot_dotdot(&qname)) {
> 		oname->name[0] = '.';
> @@ -295,14 +295,19 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
> 		return 0;
> 	}
> 	if (hash) {
> -		memcpy(buf, &hash, 4);
> -		memcpy(buf + 4, &minor_hash, 4);
> +		memcpy(buf, &hash, sizeof(u32));
> +		memcpy(buf + 4, &minor_hash, sizeof(u32));

Should "4" be replaced with something here?  sizeof(u32), or something else?

> 	} else {
> 		memset(buf, 0, 8);

Likewise, should "8" be replaced with sizeof(u64) as it is with other changes?

> 	}
> -	memcpy(buf + 8, iname->name + iname->len - 16, 16);
> +	memcpy(buf + sizeof(u64),
> +	       iname->name + iname->len - FS_FNAME_CRYPTO_DIGEST_SIZE,
> +	       FS_FNAME_CRYPTO_DIGEST_SIZE);
> 	oname->name[0] = '_';
> -	oname->len = 1 + digest_encode(buf, 24, oname->name + 1);
> +	oname->len = 1 + digest_encode(
> +			buf,
> +			FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64),
> +			oname->name + 1);
> 	return 0;
> }
> EXPORT_SYMBOL(fscrypt_fname_disk_to_usr);
> @@ -375,10 +380,11 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
> 	 */
> 	if (iname->name[0] == '_')
> 		bigname = 1;
> -	if ((bigname && (iname->len != 33)) || (!bigname && (iname->len > 43)))
> +	if ((bigname && iname->len != 55) || (!bigname && (iname->len > 43)))

The "55" constant is no better than "33" in terms of clarity, nor is "43".
Having a (computed) constant for this would be more helpful, for example

   2 * sizeof(buf) + 1

or whatever.

> 		return -ENOENT;
> 
> -	fname->crypto_buf.name = kmalloc(32, GFP_KERNEL);
> +	fname->crypto_buf.name = kmalloc(
> +			FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), GFP_KERNEL);
> 	if (fname->crypto_buf.name == NULL)
> 		return -ENOMEM;
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index c4a389a6027b..14b2a2335a32 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname,
> 			int ret;
> 			if (de->name_len < 16)
> 				return 0;
> -			ret = memcmp(de->name + de->name_len - 16,
> -				     fname->crypto_buf.name + 8, 16);
> +			ret = memcmp(de->name + de->name_len - 32,
> +				     fname->crypto_buf.name + 8, 32);
> 			return (ret == 0) ? 1 : 0;

This could just be:

			return (ret == 0);

> 		}
> 		name = fname->crypto_buf.name;
>                 len = fname->crypto_buf.len;
>         }
> #endif
>         if (de->name_len != len)
>                 return 0;
>         return (memcmp(de->name, name, len) == 0) ? 1 : 0;


And similarly:

	return (memcmp(de->name, name, len) == 0);


Cheers, Andreas

Richard Weinberger April 19, 2017, 1:37 p.m. UTC | #3

On Tue, Apr 18, 2017 at 11:06 PM, Gwendal Grignou <gwendal@chromium.org> wrote:
> If we use only 16 bytes, due to how CBC works, if the names have the
> same beginning, their last ciphertext block (16 bytes) may be identical.
>
> It happens when two file names share the first 16k bytes and both have
> length withn 16 * n + 13 and 16 * n + 16.
>
> Instead use 32 bytes to build the filenames from encrypted data when
> directory is scrambled.
>
> The drawback is the scrambled filenames change after applying the patch.
> Consider an encrypted directory with:
>
> ls -il
> total 8
> 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
> system@framework@boot-telephony-common.art.crc
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> system@framework@boot-telephony-common.oat.crc
>
> Once the key is invalidated, without the patch, ls -li produces:
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _a1Psh01n8FdhC8s9pUywlAyFzlz7n6C3
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _wJS,0akq14ehC8s9pUywlAyFzlz7n6C3
>
> Both files show with the same inode.
>
> After the patch, the names are different, but the inode information is
> now correct:
> ls -li
> 1177380 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _a1Psh01n8FtxbeglW8BqhuthSUxMqh6cFKwz2nSJDXCIXMXOvfqLcD
> 1177379 -rw-r--r--. 1 root root 0 Apr 18 12:10
> _wJS,0akq14eJcuQks7f2Vsg,zE0Jdz98FKwz2nSJDXCIXMXOvfqLcD
>
> Tested only on ext4.

I hope you classify this patch as RFC then.
We'll have problems when you just develop and test for ext4. :-)

> Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
> ---
> Script to reproduce the error:
>
> BASE_DIR="~/"
> DIR="${BASE_DIR}/tmp"
> # Create directory.
> mkdir -p "${DIR}"
> echo foobar | e4crypt add_key "${DIR}"
> # Fill directory.
> cd "${DIR}"
> touch system@framework@boot-telephony-common.oat.crc
> touch system@framework@boot-telephony-common.art.crc
> cd ..
> # Check files have different inode.
> ls -il "${DIR}"
> # Invalidate key
> KEY="$(keyctl show | grep $(e4crypt get_policy "${DIR}" | cut -d ':' -f 2) | sed -ne 's/\(.*\) --al.*/\1/p')"
> sync
> keyctl invalidate "${KEY}"
> echo 3 > /proc/sys/vm/drop_caches
> # Once the key is invalidated, both files have the same inode:
> ls -il "${DIR}"
> if [ $(ls -i1 "${DIR}" | cut -d ' ' -f 1 | uniq | wc -l) -eq 1 ] ; then
>   echo same inode!
> fi
> # if we try to remove the directory, we will get an error
> # : Structure needs cleaning
> # rm -rf "${DIR}"
>
>  fs/crypto/fname.c | 20 +++++++++++++-------
>  fs/ext4/namei.c   |  4 ++--
>  2 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
> index 80bb956e14e5..71ddc3eaa62d 100644
> --- a/fs/crypto/fname.c
> +++ b/fs/crypto/fname.c
> @@ -274,7 +274,7 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
>                         struct fscrypt_str *oname)
>  {
>         const struct qstr qname = FSTR_TO_QSTR(iname);
> -       char buf[24];
> +       char buf[FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64)];
>
>         if (fscrypt_is_dot_dotdot(&qname)) {
>                 oname->name[0] = '.';
> @@ -295,14 +295,19 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
>                 return 0;
>         }
>         if (hash) {
> -               memcpy(buf, &hash, 4);
> -               memcpy(buf + 4, &minor_hash, 4);
> +               memcpy(buf, &hash, sizeof(u32));
> +               memcpy(buf + 4, &minor_hash, sizeof(u32));
>         } else {
>                 memset(buf, 0, 8);
>         }
> -       memcpy(buf + 8, iname->name + iname->len - 16, 16);
> +       memcpy(buf + sizeof(u64),
> +              iname->name + iname->len - FS_FNAME_CRYPTO_DIGEST_SIZE,
> +              FS_FNAME_CRYPTO_DIGEST_SIZE);
>         oname->name[0] = '_';
> -       oname->len = 1 + digest_encode(buf, 24, oname->name + 1);
> +       oname->len = 1 + digest_encode(
> +                       buf,
> +                       FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64),
> +                       oname->name + 1);
>         return 0;
>  }
>  EXPORT_SYMBOL(fscrypt_fname_disk_to_usr);
> @@ -375,10 +380,11 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
>          */
>         if (iname->name[0] == '_')
>                 bigname = 1;
> -       if ((bigname && (iname->len != 33)) || (!bigname && (iname->len > 43)))
> +       if ((bigname && iname->len != 55) || (!bigname && (iname->len > 43)))
>                 return -ENOENT;
>
> -       fname->crypto_buf.name = kmalloc(32, GFP_KERNEL);
> +       fname->crypto_buf.name = kmalloc(
> +                       FS_FNAME_CRYPTO_DIGEST_SIZE + sizeof(u64), GFP_KERNEL);
>         if (fname->crypto_buf.name == NULL)
>                 return -ENOMEM;
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index c4a389a6027b..14b2a2335a32 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname,
>                         int ret;
>                         if (de->name_len < 16)
>                                 return 0;
> -                       ret = memcmp(de->name + de->name_len - 16,
> -                                    fname->crypto_buf.name + 8, 16);
> +                       ret = memcmp(de->name + de->name_len - 32,
> +                                    fname->crypto_buf.name + 8, 32);
>                         return (ret == 0) ? 1 : 0;
>                 }
>                 name = fname->crypto_buf.name;

Can the code still be able to read filenames which have been encrypted
using the "old" scheme?

Richard Weinberger April 19, 2017, 1:40 p.m. UTC | #4

Eric,

On Wed, Apr 19, 2017 at 1:01 AM, Eric Biggers <ebiggers3@gmail.com> wrote:
> +Cc linux-f2fs-devel@lists.sourceforge.net
> +Cc linux-mtd@lists.infradead.org (for ubifs)
>
> Hi Gwendal,
>
> On Tue, Apr 18, 2017 at 02:06:42PM -0700, Gwendal Grignou wrote:
>> If we use only 16 bytes, due to how CBC works, if the names have the
>> same beginning, their last ciphertext block (16 bytes) may be identical.
>>
>> It happens when two file names share the first 16k bytes and both have
>> length withn 16 * n + 13 and 16 * n + 16.
>>
>> Instead use 32 bytes to build the filenames from encrypted data when
>> directory is scrambled.
>
> Just some background for people who may be unfamiliar with what's going on here
> (and it may be useful to include some of this in the patch description):
>
> When accessing files without access to the key, userspace needs to operate on a
> filename derived from the ciphertext filename, which contains arbitrary bytes.
>
> But since it's supported to use filenames up to FILENAME_MAX (255 bytes) in
> length when using encryption, we can't always base-64 encode the filename, since
> that may make it too long.
>
> The way this is solved currently is that for filenames with ciphertext length
> greater than 32 bytes, the filesystem provides an 8-byte "cookie" (split into
> 'hash' and 'minor_hash'), which along with the last 16 bytes of the filename
> ciphertext is base-64 encoded into a fixed-length name.  The filesystem returns
> this on readdir.  Then, when a lookup is done, the filesystem translates this
> info back into a specific directory entry.
>
> Since ext4 directory entries do not contain a hash field, ext4 relies only on
> the 16 bytes of ciphertext to distinguish collisions within a directory block.
> Unfortunately, this is broken because with the encryption mode used for
> filenames (CTS), the ciphertext of the last 16-byte block depends only on the
> plaintext up to and including the *second to last* block, not up to the last
> block.  This causes long filenames that differ just near the end to collide.
>
> We could fix this by using the second to last block of ciphertext rather than
> the last one.  However, using the last *two* blocks as you're proposing should
> be fine too.
>
> Of course we could also hash the filename's ciphertext with SHA-256 or
> something, but it's nice to take advantage of the encryption mode, and not have
> to do yet another hash.
>
> I am not too worried about changing the way encrypted filenames are presented,
> since applications are not supposed to rely on this.  (Though we probably should
> be doing something to catch broken applications, like encoding the filenames
> slightly differently after each reboot...)
>
> Strangely, f2fs and ubifs do not use the bytes from the filename at all when
> trying to find a specific directory entry in this case.  So this patch doesn't
> really affect them.  This seems unreliable; perhaps we should introduce a
> function like "fscrypt_name_matches()" which all the filesystems could call?
> Can any of the f2fs and ubifs developers explain why they don't look at any
> bytes from the filename?

Not sure if I understand you correctly, but for long filenames UBIFS
does a lookup
by hash/cookie, not by filename.

Richard Weinberger April 19, 2017, 1:41 p.m. UTC | #5

On Wed, Apr 19, 2017 at 3:37 PM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> Can the code still be able to read filenames which have been encrypted
> using the "old" scheme?

Bah, typing is hard.
Should be read: Will the code still be able to read ...

Eric Biggers April 19, 2017, 5:09 p.m. UTC | #6

Hi Richard,

On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote:
> >
> > Tested only on ext4.
> 
> I hope you classify this patch as RFC then.
> We'll have problems when you just develop and test for ext4. :-)
> 

It's a little difficult for people to test stuff on UBIFS without a turn-key
solution like kvm-xfstests where they can just run something like
'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'.

I did post patches to add UBIFS support to xfstests and kvm-xfstests a few
months ago; maybe you're interested in taking them over and working to get them
merged?

> > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> > index c4a389a6027b..14b2a2335a32 100644
> > --- a/fs/ext4/namei.c
> > +++ b/fs/ext4/namei.c
> > @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname,
> >                         int ret;
> >                         if (de->name_len < 16)
> >                                 return 0;
> > -                       ret = memcmp(de->name + de->name_len - 16,
> > -                                    fname->crypto_buf.name + 8, 16);
> > +                       ret = memcmp(de->name + de->name_len - 32,
> > +                                    fname->crypto_buf.name + 8, 32);
> >                         return (ret == 0) ? 1 : 0;
> >                 }
> >                 name = fname->crypto_buf.name;
> 
> Can the code still be able to read filenames which have been encrypted
> using the "old" scheme?
> 

The patch only changes the presentation of long encrypted filenames when
accessed without the key.  It doesn't change how filenames are encrypted.

- Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Richard Weinberger April 19, 2017, 5:12 p.m. UTC | #7

Eric,

Am 19.04.2017 um 19:09 schrieb Eric Biggers:
> Hi Richard,
> 
> On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote:
>>>
>>> Tested only on ext4.
>>
>> I hope you classify this patch as RFC then.
>> We'll have problems when you just develop and test for ext4. :-)
>>
> 
> It's a little difficult for people to test stuff on UBIFS without a turn-key
> solution like kvm-xfstests where they can just run something like
> 'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'.
> 
> I did post patches to add UBIFS support to xfstests and kvm-xfstests a few
> months ago; maybe you're interested in taking them over and working to get them
> merged?

I assigned this talk already to David.
He can tell what the status is.

>>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>>> index c4a389a6027b..14b2a2335a32 100644
>>> --- a/fs/ext4/namei.c
>>> +++ b/fs/ext4/namei.c
>>> @@ -1257,8 +1257,8 @@ static inline int ext4_match(struct ext4_filename *fname,
>>>                         int ret;
>>>                         if (de->name_len < 16)
>>>                                 return 0;
>>> -                       ret = memcmp(de->name + de->name_len - 16,
>>> -                                    fname->crypto_buf.name + 8, 16);
>>> +                       ret = memcmp(de->name + de->name_len - 32,
>>> +                                    fname->crypto_buf.name + 8, 32);
>>>                         return (ret == 0) ? 1 : 0;
>>>                 }
>>>                 name = fname->crypto_buf.name;
>>
>> Can the code still be able to read filenames which have been encrypted
>> using the "old" scheme?
>>
> 
> The patch only changes the presentation of long encrypted filenames when
> accessed without the key.  It doesn't change how filenames are encrypted.

Thanks for pointing this out.

Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Eric Biggers April 19, 2017, 5:16 p.m. UTC | #8

On Wed, Apr 19, 2017 at 03:40:13PM +0200, Richard Weinberger wrote:
> > Strangely, f2fs and ubifs do not use the bytes from the filename at all when
> > trying to find a specific directory entry in this case.  So this patch doesn't
> > really affect them.  This seems unreliable; perhaps we should introduce a
> > function like "fscrypt_name_matches()" which all the filesystems could call?
> > Can any of the f2fs and ubifs developers explain why they don't look at any
> > bytes from the filename?
> 
> Not sure if I understand you correctly, but for long filenames UBIFS
> does a lookup
> by hash/cookie, not by filename.
> 

Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_?
Like F2FS, it's probably not the case that the hash is sufficient to reliably
identify a directory entry.  Granted, UBIFS does it a lot better than F2FS since
UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash
may be neither necessary nor sufficient to identify a specific directory entry,
and it should be looking at the bytes of ciphertext from the filename instead,
like what ext4 does.  (Provided that is fixed to account for how CTS mode
encryption works.)

- Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Richard Weinberger April 19, 2017, 5:21 p.m. UTC | #9

Am 19.04.2017 um 19:16 schrieb Eric Biggers:
> On Wed, Apr 19, 2017 at 03:40:13PM +0200, Richard Weinberger wrote:
>>> Strangely, f2fs and ubifs do not use the bytes from the filename at all when
>>> trying to find a specific directory entry in this case.  So this patch doesn't
>>> really affect them.  This seems unreliable; perhaps we should introduce a
>>> function like "fscrypt_name_matches()" which all the filesystems could call?
>>> Can any of the f2fs and ubifs developers explain why they don't look at any
>>> bytes from the filename?
>>
>> Not sure if I understand you correctly, but for long filenames UBIFS
>> does a lookup
>> by hash/cookie, not by filename.
>>
> 
> Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_?
> Like F2FS, it's probably not the case that the hash is sufficient to reliably
> identify a directory entry.  Granted, UBIFS does it a lot better than F2FS since
> UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash
> may be neither necessary nor sufficient to identify a specific directory entry,
> and it should be looking at the bytes of ciphertext from the filename instead,
> like what ext4 does.  (Provided that is fixed to account for how CTS mode
> encryption works.)

Let me dig into this, maybe I made a boo boo.
The idea was looking up by the filename hash and resolving
possible collisions using the secondary hash.

Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Oberhollenzer April 20, 2017, 11:24 a.m. UTC | #10

On 04/19/2017 07:12 PM, Richard Weinberger wrote:
> Eric,
> 
> Am 19.04.2017 um 19:09 schrieb Eric Biggers:
>> Hi Richard,
>>
>> On Wed, Apr 19, 2017 at 03:37:42PM +0200, Richard Weinberger wrote:
>>>>
>>>> Tested only on ext4.
>>>
>>> I hope you classify this patch as RFC then.
>>> We'll have problems when you just develop and test for ext4. :-)
>>>
>>
>> It's a little difficult for people to test stuff on UBIFS without a turn-key
>> solution like kvm-xfstests where they can just run something like
>> 'kvm-xfstests -c ext4,f2fs,ubifs -g encrypt'.
>>
>> I did post patches to add UBIFS support to xfstests and kvm-xfstests a few
>> months ago; maybe you're interested in taking them over and working to get them
>> merged?
> 
> I assigned this talk already to David.
> He can tell what the status is.
> 
What I have right now is mostly similar to the previous patch for
xfstests-dev that was submitted to the mailing list. I haven't looked
into the xfstests-bld patch yet.

To sumarize what happend so far:

A while ago, I took a look at Erics xfstest patches and the similar
patch series from Dongsheng Yang. Based on those, I got the xfstests
running with UBIFS on nandsim inside a VM with fairly little changes.
AFAIR there were two tests failing, one of them being generic/129,
apparently because it exhausts the space on one of the UBI volumes, and
another one for which a fix was provided.

My work on the xfstests-dev patch was preempted by other projects, but I
briefly got back to it when an O_TMPFILE regression was reported on the
mtd mailing list. This should have been caught by generic/004 but
wasn't because the UBIFS error message didn't match the grep pattern
run on dmesg.

Other such cases may exist.

Today, after rebasing/porting my local changes to upstream xfstets-dev,
I ran the tests against on both Richards UBIFS tree and Linus' tree. On
both kernels I'm now getting 5 failing tests out of 94 (with one of
them being generic/129 again).

Further work on some of the test scripts is required.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Richard Weinberger April 24, 2017, 9:19 p.m. UTC | #11

Eric,

On Wed, Apr 19, 2017 at 7:21 PM, Richard Weinberger <richard@nod.at> wrote:
>> Well, like I said to Jaegeuk for F2FS, that's what the code does, but _why_?
>> Like F2FS, it's probably not the case that the hash is sufficient to reliably
>> identify a directory entry.  Granted, UBIFS does it a lot better than F2FS since
>> UBIFS uses two 32-bit hashes rather than just one, but it seems the second hash
>> may be neither necessary nor sufficient to identify a specific directory entry,
>> and it should be looking at the bytes of ciphertext from the filename instead,
>> like what ext4 does.  (Provided that is fixed to account for how CTS mode
>> encryption works.)
>
> Let me dig into this, maybe I made a boo boo.
> The idea was looking up by the filename hash and resolving
> possible collisions using the secondary hash.

In ubifs_lookup() we handle two cases:
1. lookup of a bigname, both fscrypt_name->hash and ->minor_hash are valid.
    ->hash is r5(diskname) and ->minor_hash is the secondary hash, AKA cookie.
   UBIFS fed this hashes in ubifs_readdir() to fscrypt.
2. lookup of a non-bigname, in this case we compute r5(diskname) and
use the diskname
    itself for lookups.

So, in case 1 we avoid collisions by using a 64bit key and in case 2 by using
the 32bit key plus a linear search and memcmp() of diskname.

fscrypt: use 32 bytes of encrypted filename

Commit Message

Comments

Patch