diff mbox series

[RFC,15/17] fscrypt: allow load/save of extent contexts

Message ID fd5c7a78de125737abe447370fe37f9fe90155d6.1672547582.git.sweettea-kernel@dorminy.me (mailing list archive)
State Superseded
Headers show
Series fscrypt: add per-extent encryption keys | expand

Commit Message

Sweet Tea Dorminy Jan. 1, 2023, 5:06 a.m. UTC
The other half of using per-extent infos is saving and loading them from
disk.

Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
---
 fs/crypto/keysetup.c    | 50 +++++++++++++++++++++++++++++++++++++++++
 fs/crypto/policy.c      | 20 +++++++++++++++++
 include/linux/fscrypt.h | 17 ++++++++++++++
 3 files changed, 87 insertions(+)

Comments

Eric Biggers Jan. 2, 2023, 9:47 p.m. UTC | #1
On Sun, Jan 01, 2023 at 12:06:19AM -0500, Sweet Tea Dorminy wrote:
> The other half of using per-extent infos is saving and loading them from
> disk.
> 
> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
> ---
>  fs/crypto/keysetup.c    | 50 +++++++++++++++++++++++++++++++++++++++++
>  fs/crypto/policy.c      | 20 +++++++++++++++++
>  include/linux/fscrypt.h | 17 ++++++++++++++
>  3 files changed, 87 insertions(+)
> 
> diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
> index 136156487e8f..82439fb73dd9 100644
> --- a/fs/crypto/keysetup.c
> +++ b/fs/crypto/keysetup.c
> @@ -799,6 +799,56 @@ void fscrypt_free_extent_info(struct fscrypt_info **info_ptr)
>  }
>  EXPORT_SYMBOL_GPL(fscrypt_free_extent_info);
>  
> +/**
> + * fscrypt_load_extent_info() - set up a preexisting extent's fscrypt_info
> + * @inode: the inode to which the extent belongs. Must be encrypted.
> + * @buf: a buffer containing the extent's stored context
> + * @len: the length of the @ctx buffer
> + * @info_ptr: a pointer to return the extent's fscrypt_info into. Should be
> + *	      a pointer to a member of the extent struct, as it will be passed
> + *	      back to the filesystem if key removal demands removal of the
> + *	      info from the extent
> + *
> + * This is not %GFP_NOFS safe, so the caller is expected to call
> + * memalloc_nofs_save/restore() if appropriate.
> + *
> + * Return: 0 if successful, or -errno if it fails.
> + */
> +int fscrypt_load_extent_info(struct inode *inode, void *buf, size_t len,
> +			     struct fscrypt_info **info_ptr)
> +{

When is the filesystem going to call fscrypt_load_extent_info()?

My concern (which we've discussed, but probably I didn't explain clearly enough)
is that the two "naive" solutions don't really work:

Option 1: Set up during the I/O to the extent.  I think this is not feasible
because the full fscrypt_setup_encryption_info() is not safe to do doing I/O.
For example, it allocates memory under GFP_KERNEL, and it uses
crypto_alloc_skcipher() which can involve loading kernel modules.

Option 2: Set up for *all* extents when the file is opened.  I expect that this
would not be feasible either, since a file can have a huge number of extents.

This patchset seems to be assuming one of those options.  (It's not clear
whether it's Option 1 or Option 2, since the caller of
fscrypt_load_extent_info() is not included in the patchset.)

That leaves the option I suggested, which probably I didn't explain clearly
enough: split up key setup so that part can be done when opening the file, and
part can be done during I/O.  Specifically, when opening the file, preallocate
some number of crypto_skcipher objects.  This would be limited to a fixed
number, like 128, even if the file has thousands of extents.  Then, when doing
I/O to an extent, temporarily take a crypto_skcipher from the cache, derive the
extent's key using fscrypt_hkdf_expand(), and set it in the crypto_skcipher
using crypto_skcipher_setkey().

That way, during I/O only fscrypt_hkdf_expand() and crypto_skcipher_setkey()
would be needed.  Those are fairly safe to call during I/O, in contrast to
crypto_alloc_skcipher() which is really problematic to call during I/O.

Of course, it will still be somewhat expensive to derive and set a key.  So it
might also make sense to maintain a map that maps (master key, extent nonce,
encryption mode) to the corresponding cached crypto_skcipher, if any, so that an
already-prepared one can be used when possible.

By the way, blk-crypto-fallback (block/blk-crypto-fallback.c) does something
similar, as it had to solve a similar problem.  The way it was solved is to
require that blk_crypto_fallback_start_using_mode() be called to preallocate the
crypto_skcipher objects for a given encryption mode.

You actually could just use blk-crypto-fallback to do the en/decryption for you,
if you wanted to.  I.e., you could use the blk-crypto API for en/decryption,
instead of going directly through crypto_skcipher.  (Note that currently
blk-crypto is opt-in via the "inlinecrypt" mount option; it's not used by
default.  But it doesn't *have* to be that way; it could just be always used.
It would be necessary to ensure that CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK gets
selected.)  That would save having to reimplement the caching of crypto_skcipher
objects.  However, key derivation would still need to be done at the filesystem
level, so it probably would still make sense to cache derived keys.

- Eric
Sweet Tea Dorminy Jan. 2, 2023, 10:31 p.m. UTC | #2
(in which I fail to reply-all the first time)

> 
> When is the filesystem going to call fscrypt_load_extent_info()?
> 
> My concern (which we've discussed, but probably I didn't explain clearly enough)
> is that the two "naive" solutions don't really work:
> 
> Option 1: Set up during the I/O to the extent.  I think this is not feasible
> because the full fscrypt_setup_encryption_info() is not safe to do doing I/O.
> For example, it allocates memory under GFP_KERNEL, and it uses
> crypto_alloc_skcipher() which can involve loading kernel modules.
> 
memalloc_nofs_save()/memalloc_nofs_restore() could do the job of making 
sure allocations use GFP_NOFS, I think? I guess those calls should be in 
fscrypt_load_extent_info() instead of just in the doc...
> 
> That leaves the option I suggested, which probably I didn't explain clearly
> enough: split up key setup so that part can be done when opening the file, and
> part can be done during I/O.  Specifically, when opening the file, preallocate
> some number of crypto_skcipher objects.  This would be limited to a fixed
> number, like 128, even if the file has thousands of extents.  Then, when doing
> I/O to an extent, temporarily take a crypto_skcipher from the cache, derive the
> extent's key using fscrypt_hkdf_expand(), and set it in the crypto_skcipher
> using crypto_skcipher_setkey().

I didn't elaborate why it wasn't here and should have. With just this 
patchset, I thought a file only ever has extents with the same 
encryption mode, since an inode can't change encryption mode/key past 
creation at present . So loading the parent dir's fscrypt_info should be 
enough to ensure the module is loaded for the mode of all the extents. I 
suppose I'd need to ensure that reflinks also only reflink extents 
sharing the same encryption mode.

In the future patchset which allows changing the key being used for new 
extents (naming that is hard), I had envisioned requiring the filesystem 
to provide a list of enc modes used by an inode when opening, and then 
fscrypt_file_open() could make sure all the necessary modules are loaded 
for those modes.

> 
> That way, during I/O only fscrypt_hkdf_expand() and crypto_skcipher_setkey()
> would be needed.  Those are fairly safe to call during I/O, in contrast to
> crypto_alloc_skcipher() which is really problematic to call during I/O.

Could we use a mempool of skciphers for all of fscrypt, or should it 
only be for extents and be on a per-file basis?

> 
> Of course, it will still be somewhat expensive to derive and set a key.  So it
> might also make sense to maintain a map that maps (master key, extent nonce,
> encryption mode) to the corresponding cached crypto_skcipher, if any, so that an
> already-prepared one can be used when possible.
Same question: just for extent infos, or can this be generalized to all 
infos?


> 
> By the way, blk-crypto-fallback (block/blk-crypto-fallback.c) does something
> similar, as it had to solve a similar problem.  The way it was solved is to
> require that blk_crypto_fallback_start_using_mode() be called to preallocate the
> crypto_skcipher objects for a given encryption mode.
> 
> You actually could just use blk-crypto-fallback to do the en/decryption for you,
> if you wanted to.  I.e., you could use the blk-crypto API for en/decryption,
> instead of going directly through crypto_skcipher.  (Note that currently
> blk-crypto is opt-in via the "inlinecrypt" mount option; it's not used by
> default.  But it doesn't *have* to be that way; it could just be always used.
> It would be necessary to ensure that CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK gets
> selected.)  That would save having to reimplement the caching of crypto_skcipher
> objects.  However, key derivation would still need to be done at the filesystem
> level, so it probably would still make sense to cache derived keys.
> 
> - Eric


Interesting. I will have to study that more, thanks for the pointer.

Thank you!

Sweet Tea
Eric Biggers Jan. 2, 2023, 10:51 p.m. UTC | #3
On Mon, Jan 02, 2023 at 05:31:02PM -0500, Sweet Tea Dorminy wrote:
> (in which I fail to reply-all the first time)
> 
> > 
> > When is the filesystem going to call fscrypt_load_extent_info()?
> > 
> > My concern (which we've discussed, but probably I didn't explain clearly enough)
> > is that the two "naive" solutions don't really work:
> > 
> > Option 1: Set up during the I/O to the extent.  I think this is not feasible
> > because the full fscrypt_setup_encryption_info() is not safe to do doing I/O.
> > For example, it allocates memory under GFP_KERNEL, and it uses
> > crypto_alloc_skcipher() which can involve loading kernel modules.
> > 
> memalloc_nofs_save()/memalloc_nofs_restore() could do the job of making sure
> allocations use GFP_NOFS, I think? I guess those calls should be in
> fscrypt_load_extent_info() instead of just in the doc...
> > 
> > That leaves the option I suggested, which probably I didn't explain clearly
> > enough: split up key setup so that part can be done when opening the file, and
> > part can be done during I/O.  Specifically, when opening the file, preallocate
> > some number of crypto_skcipher objects.  This would be limited to a fixed
> > number, like 128, even if the file has thousands of extents.  Then, when doing
> > I/O to an extent, temporarily take a crypto_skcipher from the cache, derive the
> > extent's key using fscrypt_hkdf_expand(), and set it in the crypto_skcipher
> > using crypto_skcipher_setkey().
> 
> I didn't elaborate why it wasn't here and should have. With just this
> patchset, I thought a file only ever has extents with the same encryption
> mode, since an inode can't change encryption mode/key past creation at
> present . So loading the parent dir's fscrypt_info should be enough to
> ensure the module is loaded for the mode of all the extents. I suppose I'd
> need to ensure that reflinks also only reflink extents sharing the same
> encryption mode.
> 
> In the future patchset which allows changing the key being used for new
> extents (naming that is hard), I had envisioned requiring the filesystem to
> provide a list of enc modes used by an inode when opening, and then
> fscrypt_file_open() could make sure all the necessary modules are loaded for
> those modes.

"Loading the parent dir's fscrypt_info" isn't relevant here, since the filenames
encryption mode can be different from the contents encryption mode.  It's also
possible for an encrypted file to be located in an unencrypted directory.  Maybe
you meant to be talking about the file itself being opened?

Anyway, crypto_alloc_skcipher() takes a lock (crypto_alg_sem) under which memory
is allocated with GFP_KERNEL.  So neither preloading kernel modules nor
memalloc_nofs_save() helps for it; it's still not GFP_NOFS-safe.

I don't think we should allow files to have extents with different encryption
modes.  Encryption policies will still be a property of files.

> > That way, during I/O only fscrypt_hkdf_expand() and crypto_skcipher_setkey()
> > would be needed.  Those are fairly safe to call during I/O, in contrast to
> > crypto_alloc_skcipher() which is really problematic to call during I/O.
> 
> Could we use a mempool of skciphers for all of fscrypt, or should it only be
> for extents and be on a per-file basis?

It probably should be global, or at least per-master-key, as it would be a
massive overhead to allocate (e.g.) 128 crypto_skcipher objects for every file.
blk-crypto-fallback uses a global cache.

It does introduce a bottleneck and memory that can't be reclaimed, though.  I'd
appreciate any thoughts about other solutions.  Maybe the number of objects in
the cache could be scaled up and down as the number of in-core inodes changes.

> > Of course, it will still be somewhat expensive to derive and set a key.  So it
> > might also make sense to maintain a map that maps (master key, extent nonce,
> > encryption mode) to the corresponding cached crypto_skcipher, if any, so that an
> > already-prepared one can be used when possible.
> Same question: just for extent infos, or can this be generalized to all
> infos?

I think that we should definitely still cache the per-file key in
inode::i_crypt_info and not in some global cache.

- Eric
Sweet Tea Dorminy Jan. 3, 2023, 12:33 a.m. UTC | #4
> 
> Anyway, crypto_alloc_skcipher() takes a lock (crypto_alg_sem) under which memory
> is allocated with GFP_KERNEL.  So neither preloading kernel modules nor
> memalloc_nofs_save() helps for it; it's still not GFP_NOFS-safe.

I'm still confused. My understanding is that memalloc_nofs_save() means 
all allocations on that thread until memalloc_nofs_restore() is called 
effectively gets GFP_NOFS appended to the allocation flags. So since 
crypto_alloc_skcipher()'s allocation appears to be on the same thread as 
we'd be calling memalloc_nofs_save/restore(), it would presumably get 
allocated as though it had flags GFP_KERNEL | GFP_NOFS, even though the 
call is kzalloc(..., GFP_KERNEL, ...).

I don't understand how the lock would make a difference. Can you elaborate?

Sorry for my confusion...

Sweet Tea
Eric Biggers Jan. 3, 2023, 12:47 a.m. UTC | #5
On Mon, Jan 02, 2023 at 07:33:15PM -0500, Sweet Tea Dorminy wrote:
> 
> > 
> > Anyway, crypto_alloc_skcipher() takes a lock (crypto_alg_sem) under which memory
> > is allocated with GFP_KERNEL.  So neither preloading kernel modules nor
> > memalloc_nofs_save() helps for it; it's still not GFP_NOFS-safe.
> 
> I'm still confused. My understanding is that memalloc_nofs_save() means all
> allocations on that thread until memalloc_nofs_restore() is called
> effectively gets GFP_NOFS appended to the allocation flags. So since
> crypto_alloc_skcipher()'s allocation appears to be on the same thread as
> we'd be calling memalloc_nofs_save/restore(), it would presumably get
> allocated as though it had flags GFP_KERNEL | GFP_NOFS, even though the call
> is kzalloc(..., GFP_KERNEL, ...).
> 
> I don't understand how the lock would make a difference. Can you elaborate?
> 
> Sorry for my confusion...

Other tasks (using the crypto API for another purpose, perhaps totally unrelated
to fs/crypto/) can take crypto_alg_sem without taking the same precaution.  So,
when your task that is running in fs-reclaim context and has used
memalloc_nofs_save() tries to take the same lock, it might be that the lock is
already held by another thread that is waiting for fs-reclaim to complete in
order to satisfy a GFP_KERNEL allocation.

That's a deadlock.

Locks are only GFP_NOFS-safe when everyone agrees to use them that way.

- Eric
Sweet Tea Dorminy Jan. 3, 2023, 1:23 a.m. UTC | #6
On 1/2/23 19:47, Eric Biggers wrote:
> On Mon, Jan 02, 2023 at 07:33:15PM -0500, Sweet Tea Dorminy wrote:
>>
>>>
>>> Anyway, crypto_alloc_skcipher() takes a lock (crypto_alg_sem) under which memory
>>> is allocated with GFP_KERNEL.  So neither preloading kernel modules nor
>>> memalloc_nofs_save() helps for it; it's still not GFP_NOFS-safe.
>>
>> I'm still confused. My understanding is that memalloc_nofs_save() means all
>> allocations on that thread until memalloc_nofs_restore() is called
>> effectively gets GFP_NOFS appended to the allocation flags. So since
>> crypto_alloc_skcipher()'s allocation appears to be on the same thread as
>> we'd be calling memalloc_nofs_save/restore(), it would presumably get
>> allocated as though it had flags GFP_KERNEL | GFP_NOFS, even though the call
>> is kzalloc(..., GFP_KERNEL, ...).
>>
>> I don't understand how the lock would make a difference. Can you elaborate?
>>
>> Sorry for my confusion...
> 
> Other tasks (using the crypto API for another purpose, perhaps totally unrelated
> to fs/crypto/) can take crypto_alg_sem without taking the same precaution.  So,
> when your task that is running in fs-reclaim context and has used
> memalloc_nofs_save() tries to take the same lock, it might be that the lock is
> already held by another thread that is waiting for fs-reclaim to complete in
> order to satisfy a GFP_KERNEL allocation.
> 
> That's a deadlock.
> 
> Locks are only GFP_NOFS-safe when everyone agrees to use them that way.
> 
> - Eric

Ah that is definitely what I was missing, I've never had to think about 
that interaction. Thank you for explaining!
diff mbox series

Patch

diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index 136156487e8f..82439fb73dd9 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -799,6 +799,56 @@  void fscrypt_free_extent_info(struct fscrypt_info **info_ptr)
 }
 EXPORT_SYMBOL_GPL(fscrypt_free_extent_info);
 
+/**
+ * fscrypt_load_extent_info() - set up a preexisting extent's fscrypt_info
+ * @inode: the inode to which the extent belongs. Must be encrypted.
+ * @buf: a buffer containing the extent's stored context
+ * @len: the length of the @ctx buffer
+ * @info_ptr: a pointer to return the extent's fscrypt_info into. Should be
+ *	      a pointer to a member of the extent struct, as it will be passed
+ *	      back to the filesystem if key removal demands removal of the
+ *	      info from the extent
+ *
+ * This is not %GFP_NOFS safe, so the caller is expected to call
+ * memalloc_nofs_save/restore() if appropriate.
+ *
+ * Return: 0 if successful, or -errno if it fails.
+ */
+int fscrypt_load_extent_info(struct inode *inode, void *buf, size_t len,
+			     struct fscrypt_info **info_ptr)
+{
+	int res;
+	union fscrypt_context ctx;
+	union fscrypt_policy policy;
+
+	if (!fscrypt_has_encryption_key(inode))
+		return -EINVAL;
+
+	memcpy(&ctx, buf, len);
+
+	res = fscrypt_policy_from_context(&policy, &ctx, res);
+	if (res) {
+		fscrypt_warn(inode,
+			     "Unrecognized or corrupt encryption context");
+		return res;
+	}
+
+	if (!fscrypt_supported_policy(&policy, inode)) {
+		return -EINVAL;
+	}
+
+	res = fscrypt_setup_encryption_info(inode, &policy,
+					    fscrypt_context_nonce(&ctx),
+					    IS_CASEFOLDED(inode) &&
+					    S_ISDIR(inode->i_mode),
+					    info_ptr);
+
+	if (res == -ENOPKG) /* Algorithm unavailable? */
+		res = 0;
+	return res;
+}
+EXPORT_SYMBOL_GPL(fscrypt_load_extent_info);
+
 /**
  * fscrypt_put_encryption_info() - free most of an inode's fscrypt data
  * @inode: an inode being evicted
diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
index e7de4872d375..aab5edc1155e 100644
--- a/fs/crypto/policy.c
+++ b/fs/crypto/policy.c
@@ -751,6 +751,26 @@  int fscrypt_set_context(struct inode *inode, void *fs_data)
 }
 EXPORT_SYMBOL_GPL(fscrypt_set_context);
 
+/**
+ * fscrypt_set_extent_context() - Set the fscrypt context for an extent
+ * @ci: info from which to fetch policy and nonce
+ * @ctx: where context should be written
+ * @len: the size of ctx
+ *
+ * Given an fscrypt_info belonging to an extent (generated via
+ * fscrypt_prepare_new_extent()), generate a new context and write it to @ctx.
+ * len is checked to be at least FSCRYPT_SET_CONTEXT_MAX_SIZE bytes.
+ *
+ * Return: size of the resulting context or a negative error code.
+ */
+int fscrypt_set_extent_context(struct fscrypt_info *ci, void *ctx, size_t len)
+{
+	if (len < FSCRYPT_SET_CONTEXT_MAX_SIZE)
+		return -EINVAL;
+	return fscrypt_new_context(ctx, &ci->ci_policy, ci->ci_nonce);
+}
+EXPORT_SYMBOL_GPL(fscrypt_set_extent_context);
+
 /**
  * fscrypt_parse_test_dummy_encryption() - parse the test_dummy_encryption mount option
  * @param: the mount option
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 6afdcb27f8a2..47c2df1061c7 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -324,6 +324,8 @@  int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg);
 int fscrypt_has_permitted_context(struct inode *parent, struct inode *child);
 int fscrypt_context_for_new_inode(void *ctx, struct inode *inode);
 int fscrypt_set_context(struct inode *inode, void *fs_data);
+int fscrypt_set_extent_context(struct fscrypt_info *info, void *ctx,
+			       size_t len);
 
 struct fscrypt_dummy_policy {
 	const union fscrypt_policy *policy;
@@ -367,6 +369,8 @@  int fscrypt_prepare_new_extent(struct inode *inode,
 			       struct fscrypt_info **info_ptr,
 			       bool *encrypt_ret);
 void fscrypt_free_extent_info(struct fscrypt_info **info_ptr);
+int fscrypt_load_extent_info(struct inode *inode, void *buf, size_t len,
+			     struct fscrypt_info **info_ptr);
 
 /* fname.c */
 int fscrypt_fname_encrypt(const struct inode *inode, const struct qstr *iname,
@@ -532,6 +536,12 @@  static inline int fscrypt_set_context(struct inode *inode, void *fs_data)
 	return -EOPNOTSUPP;
 }
 
+static inline int fscrypt_set_extent_context(struct fscrypt_info *info,
+					     void *ctx, size_t len)
+{
+	return -EOPNOTSUPP;
+}
+
 struct fscrypt_dummy_policy {
 };
 
@@ -638,6 +648,13 @@  static inline void fscrypt_free_extent_info(struct fscrypt_info **info_ptr)
 {
 }
 
+static inline int fscrypt_load_extent_info(struct inode *inode, void *buf,
+					   size_t len,
+					   struct fscrypt_info **info_ptr)
+{
+	return -EOPNOTSUPP;
+}
+
  /* fname.c */
 static inline int fscrypt_setup_filename(struct inode *dir,
 					 const struct qstr *iname,