mbox series

[RFC,v10,00/48] ceph+fscrypt: full support

Message ID 20220111191608.88762-1-jlayton@kernel.org (mailing list archive)
Headers show
Series ceph+fscrypt: full support | expand

Message

Jeff Layton Jan. 11, 2022, 7:15 p.m. UTC
This patchset represents a (mostly) complete rough draft of fscrypt
support for cephfs. The context, filename and symlink support is more or
less the same as the versions posted before, and comprise the first half
of the patches.

The new bits here are the size handling changes and support for content
encryption, in buffered, direct and synchronous codepaths. Much of this
code is still very rough and needs a lot of cleanup work.

fscrypt support relies on some MDS changes that are being tracked here:

    https://github.com/ceph/ceph/pull/43588

In particular, this PR adds some new opaque fields in the inode that we
use to store fscrypt-specific information, like the context and the real
size of a file. That is slated to be merged for the upcoming Quincy
release (which is sometime this northern spring).

There are still some notable bugs:

1/ we've identified a few more potential races in truncate handling
which will probably necessitate a protocol change, as well as changes to
the MDS and kclient patchsets. The good news is that we think we have
an approach that will resolve this.

2/ the kclient doesn't handle reading sparse regions in OSD objects
properly yet. The client can end up writing to a non-zero offset in a
non-existent object. Then, if the client tries to read the written
region back later, it'll get back zeroes and give you garbage when you
try to decrypt them.

It turns out that the OSD already supports a SPARSE_READ operation, so
I'm working on implementing that in the kclient to make it not try to
decrypt the sparse regions.

Still, I was able to run xfstests on this set yesterday. Bug #2 above
prevented all of the tests from passing, but it didn't oops! I call that
progress! Given that, I figured this is a good time to post what I have
so far.

Note that the buffered I/O changes in this set are not suitable for
merge and will likely end up being discarded. We need to plumb the
encryption in at the netfs layer, so that we can store encrypted data
in fscache.

The non-buffered codepaths will likely also need substantial changes
before merging. It may be simpler to just move that into the netfs layer
too as cifs will need something similar anyway.

My goal is to get most of this into v5.18, but v5.19 might be more
realistiv. Hopefully I'll have a non-RFC patchset to send in a few
weeks.

Special thanks to Xiubo who came through with the MDS patches. Also,
thanks to everyone (especially Eric Biggers) for all of the previous
reviews. It's much appreciated!

Jeff Layton (43):
  vfs: export new_inode_pseudo
  fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode
  fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
  fscrypt: add fscrypt_context_for_new_inode
  ceph: preallocate inode for ops that may create one
  ceph: crypto context handling for ceph
  ceph: parse new fscrypt_auth and fscrypt_file fields in inode traces
  ceph: add fscrypt_* handling to caps.c
  ceph: add ability to set fscrypt_auth via setattr
  ceph: implement -o test_dummy_encryption mount option
  ceph: decode alternate_name in lease info
  ceph: add fscrypt ioctls
  ceph: make ceph_msdc_build_path use ref-walk
  ceph: add encrypted fname handling to ceph_mdsc_build_path
  ceph: send altname in MClientRequest
  ceph: encode encrypted name in dentry release
  ceph: properly set DCACHE_NOKEY_NAME flag in lookup
  ceph: make d_revalidate call fscrypt revalidator for encrypted
    dentries
  ceph: add helpers for converting names for userland presentation
  ceph: add fscrypt support to ceph_fill_trace
  ceph: add support to readdir for encrypted filenames
  ceph: create symlinks with encrypted and base64-encoded targets
  ceph: make ceph_get_name decrypt filenames
  ceph: add a new ceph.fscrypt.auth vxattr
  ceph: add some fscrypt guardrails
  libceph: add CEPH_OSD_OP_ASSERT_VER support
  ceph: size handling for encrypted inodes in cap updates
  ceph: fscrypt_file field handling in MClientRequest messages
  ceph: get file size from fscrypt_file when present in inode traces
  ceph: handle fscrypt fields in cap messages from MDS
  ceph: add infrastructure for file encryption and decryption
  libceph: allow ceph_osdc_new_request to accept a multi-op read
  ceph: disable fallocate for encrypted inodes
  ceph: disable copy offload on encrypted inodes
  ceph: don't use special DIO path for encrypted inodes
  ceph: set encryption context on open
  ceph: align data in pages in ceph_sync_write
  ceph: add read/modify/write to ceph_sync_write
  ceph: plumb in decryption during sync reads
  ceph: set i_blkbits to crypto block size for encrypted inodes
  ceph: add fscrypt decryption support to ceph_netfs_issue_op
  ceph: add encryption support to writepage
  ceph: fscrypt support for writepages

Luis Henriques (1):
  ceph: don't allow changing layout on encrypted files/directories

Xiubo Li (4):
  ceph: add __ceph_get_caps helper support
  ceph: add __ceph_sync_read helper support
  ceph: add object version support for sync read
  ceph: add truncate size handling support for fscrypt

 fs/ceph/Makefile                |   1 +
 fs/ceph/acl.c                   |   4 +-
 fs/ceph/addr.c                  | 128 +++++--
 fs/ceph/caps.c                  | 211 ++++++++++--
 fs/ceph/crypto.c                | 374 +++++++++++++++++++++
 fs/ceph/crypto.h                | 237 +++++++++++++
 fs/ceph/dir.c                   | 209 +++++++++---
 fs/ceph/export.c                |  44 ++-
 fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
 fs/ceph/inode.c                 | 576 +++++++++++++++++++++++++++++---
 fs/ceph/ioctl.c                 |  87 +++++
 fs/ceph/mds_client.c            | 349 ++++++++++++++++---
 fs/ceph/mds_client.h            |  24 +-
 fs/ceph/super.c                 |  90 ++++-
 fs/ceph/super.h                 |  43 ++-
 fs/ceph/xattr.c                 |  29 ++
 fs/crypto/fname.c               |  44 ++-
 fs/crypto/fscrypt_private.h     |   9 +-
 fs/crypto/hooks.c               |   6 +-
 fs/crypto/policy.c              |  35 +-
 fs/inode.c                      |   1 +
 include/linux/ceph/ceph_fs.h    |  21 +-
 include/linux/ceph/osd_client.h |   6 +-
 include/linux/ceph/rados.h      |   4 +
 include/linux/fscrypt.h         |  10 +
 net/ceph/osd_client.c           |  32 +-
 26 files changed, 2700 insertions(+), 350 deletions(-)
 create mode 100644 fs/ceph/crypto.c
 create mode 100644 fs/ceph/crypto.h

Comments

Jeff Layton Jan. 11, 2022, 7:26 p.m. UTC | #1
On Tue, 2022-01-11 at 14:15 -0500, Jeff Layton wrote:
> This patchset represents a (mostly) complete rough draft of fscrypt
> support for cephfs. The context, filename and symlink support is more or
> less the same as the versions posted before, and comprise the first half
> of the patches.
> 
> The new bits here are the size handling changes and support for content
> encryption, in buffered, direct and synchronous codepaths. Much of this
> code is still very rough and needs a lot of cleanup work.
> 
> fscrypt support relies on some MDS changes that are being tracked here:
> 
>     https://github.com/ceph/ceph/pull/43588
> 
> In particular, this PR adds some new opaque fields in the inode that we
> use to store fscrypt-specific information, like the context and the real
> size of a file. That is slated to be merged for the upcoming Quincy
> release (which is sometime this northern spring).
> 
> There are still some notable bugs:
> 
> 1/ we've identified a few more potential races in truncate handling
> which will probably necessitate a protocol change, as well as changes to
> the MDS and kclient patchsets. The good news is that we think we have
> an approach that will resolve this.
> 
> 2/ the kclient doesn't handle reading sparse regions in OSD objects
> properly yet. The client can end up writing to a non-zero offset in a
> non-existent object. Then, if the client tries to read the written
> region back later, it'll get back zeroes and give you garbage when you
> try to decrypt them.
> 
> It turns out that the OSD already supports a SPARSE_READ operation, so
> I'm working on implementing that in the kclient to make it not try to
> decrypt the sparse regions.
> 
> Still, I was able to run xfstests on this set yesterday. Bug #2 above
> prevented all of the tests from passing, but it didn't oops! I call that
> progress! Given that, I figured this is a good time to post what I have
> so far.
> 
> Note that the buffered I/O changes in this set are not suitable for
> merge and will likely end up being discarded. We need to plumb the
> encryption in at the netfs layer, so that we can store encrypted data
> in fscache.
> 
> The non-buffered codepaths will likely also need substantial changes
> before merging. It may be simpler to just move that into the netfs layer
> too as cifs will need something similar anyway.
> 
> My goal is to get most of this into v5.18, but v5.19 might be more
> realistiv. Hopefully I'll have a non-RFC patchset to send in a few
> weeks.
> 
> Special thanks to Xiubo who came through with the MDS patches. Also,
> thanks to everyone (especially Eric Biggers) for all of the previous
> reviews. It's much appreciated!
> 
> Jeff Layton (43):
>   vfs: export new_inode_pseudo
>   fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode
>   fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
>   fscrypt: add fscrypt_context_for_new_inode
>   ceph: preallocate inode for ops that may create one
>   ceph: crypto context handling for ceph
>   ceph: parse new fscrypt_auth and fscrypt_file fields in inode traces
>   ceph: add fscrypt_* handling to caps.c
>   ceph: add ability to set fscrypt_auth via setattr
>   ceph: implement -o test_dummy_encryption mount option
>   ceph: decode alternate_name in lease info
>   ceph: add fscrypt ioctls
>   ceph: make ceph_msdc_build_path use ref-walk
>   ceph: add encrypted fname handling to ceph_mdsc_build_path
>   ceph: send altname in MClientRequest
>   ceph: encode encrypted name in dentry release
>   ceph: properly set DCACHE_NOKEY_NAME flag in lookup
>   ceph: make d_revalidate call fscrypt revalidator for encrypted
>     dentries
>   ceph: add helpers for converting names for userland presentation
>   ceph: add fscrypt support to ceph_fill_trace
>   ceph: add support to readdir for encrypted filenames
>   ceph: create symlinks with encrypted and base64-encoded targets
>   ceph: make ceph_get_name decrypt filenames
>   ceph: add a new ceph.fscrypt.auth vxattr
>   ceph: add some fscrypt guardrails
>   libceph: add CEPH_OSD_OP_ASSERT_VER support
>   ceph: size handling for encrypted inodes in cap updates
>   ceph: fscrypt_file field handling in MClientRequest messages
>   ceph: get file size from fscrypt_file when present in inode traces
>   ceph: handle fscrypt fields in cap messages from MDS
>   ceph: add infrastructure for file encryption and decryption
>   libceph: allow ceph_osdc_new_request to accept a multi-op read
>   ceph: disable fallocate for encrypted inodes
>   ceph: disable copy offload on encrypted inodes
>   ceph: don't use special DIO path for encrypted inodes
>   ceph: set encryption context on open
>   ceph: align data in pages in ceph_sync_write
>   ceph: add read/modify/write to ceph_sync_write
>   ceph: plumb in decryption during sync reads
>   ceph: set i_blkbits to crypto block size for encrypted inodes
>   ceph: add fscrypt decryption support to ceph_netfs_issue_op
>   ceph: add encryption support to writepage
>   ceph: fscrypt support for writepages
> 
> Luis Henriques (1):
>   ceph: don't allow changing layout on encrypted files/directories
> 
> Xiubo Li (4):
>   ceph: add __ceph_get_caps helper support
>   ceph: add __ceph_sync_read helper support
>   ceph: add object version support for sync read
>   ceph: add truncate size handling support for fscrypt
> 
>  fs/ceph/Makefile                |   1 +
>  fs/ceph/acl.c                   |   4 +-
>  fs/ceph/addr.c                  | 128 +++++--
>  fs/ceph/caps.c                  | 211 ++++++++++--
>  fs/ceph/crypto.c                | 374 +++++++++++++++++++++
>  fs/ceph/crypto.h                | 237 +++++++++++++
>  fs/ceph/dir.c                   | 209 +++++++++---
>  fs/ceph/export.c                |  44 ++-
>  fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
>  fs/ceph/inode.c                 | 576 +++++++++++++++++++++++++++++---
>  fs/ceph/ioctl.c                 |  87 +++++
>  fs/ceph/mds_client.c            | 349 ++++++++++++++++---
>  fs/ceph/mds_client.h            |  24 +-
>  fs/ceph/super.c                 |  90 ++++-
>  fs/ceph/super.h                 |  43 ++-
>  fs/ceph/xattr.c                 |  29 ++
>  fs/crypto/fname.c               |  44 ++-
>  fs/crypto/fscrypt_private.h     |   9 +-
>  fs/crypto/hooks.c               |   6 +-
>  fs/crypto/policy.c              |  35 +-
>  fs/inode.c                      |   1 +
>  include/linux/ceph/ceph_fs.h    |  21 +-
>  include/linux/ceph/osd_client.h |   6 +-
>  include/linux/ceph/rados.h      |   4 +
>  include/linux/fscrypt.h         |  10 +
>  net/ceph/osd_client.c           |  32 +-
>  26 files changed, 2700 insertions(+), 350 deletions(-)
>  create mode 100644 fs/ceph/crypto.c
>  create mode 100644 fs/ceph/crypto.h
> 

I should also mention that I've pushed this series into a new
wip-fscrypt branch in the ceph-client tree for anyone that wants to
check it out.

    https://github.com/ceph/ceph-client/commits/wip-fscrypt

I can't recommend this for general use yet until the data corruption
bugs are fixed, of course.
Eric Biggers Jan. 27, 2022, 2:14 a.m. UTC | #2
On Tue, Jan 11, 2022 at 02:15:20PM -0500, Jeff Layton wrote:
> Still, I was able to run xfstests on this set yesterday. Bug #2 above
> prevented all of the tests from passing, but it didn't oops! I call that
> progress! Given that, I figured this is a good time to post what I have
> so far.

One question: what sort of testing are you doing to show that the file contents
and filenames being stored (i.e., sent by the client to the server in this case)
have been encrypted correctly?  xfstests has tests that verify this for block
device based filesystems; are you doing any equivalent testing?

- Eric
Jeff Layton Jan. 27, 2022, 11:08 a.m. UTC | #3
On Wed, 2022-01-26 at 18:14 -0800, Eric Biggers wrote:
> On Tue, Jan 11, 2022 at 02:15:20PM -0500, Jeff Layton wrote:
> > Still, I was able to run xfstests on this set yesterday. Bug #2 above
> > prevented all of the tests from passing, but it didn't oops! I call that
> > progress! Given that, I figured this is a good time to post what I have
> > so far.
> 
> One question: what sort of testing are you doing to show that the file contents
> and filenames being stored (i.e., sent by the client to the server in this case)
> have been encrypted correctly?  xfstests has tests that verify this for block
> device based filesystems; are you doing any equivalent testing?
> 

I've been testing this pretty regularly with xfstests, and the filenames
portion all seems to be working correctly. Parts of the content
encryption also seem to work ok. I'm still working that piece, so I
haven't been able to validate that part yet.

At the moment I'm working on switching the ceph client over to doing
sparse reads, which is necessary in order to be able to handle sparse
writes without filling in unwritten holes.
Eric Biggers Jan. 28, 2022, 8:39 p.m. UTC | #4
On Thu, Jan 27, 2022 at 06:08:40AM -0500, Jeff Layton wrote:
> On Wed, 2022-01-26 at 18:14 -0800, Eric Biggers wrote:
> > On Tue, Jan 11, 2022 at 02:15:20PM -0500, Jeff Layton wrote:
> > > Still, I was able to run xfstests on this set yesterday. Bug #2 above
> > > prevented all of the tests from passing, but it didn't oops! I call that
> > > progress! Given that, I figured this is a good time to post what I have
> > > so far.
> > 
> > One question: what sort of testing are you doing to show that the file contents
> > and filenames being stored (i.e., sent by the client to the server in this case)
> > have been encrypted correctly?  xfstests has tests that verify this for block
> > device based filesystems; are you doing any equivalent testing?
> > 
> 
> I've been testing this pretty regularly with xfstests, and the filenames
> portion all seems to be working correctly. Parts of the content
> encryption also seem to work ok. I'm still working that piece, so I
> haven't been able to validate that part yet.
> 
> At the moment I'm working on switching the ceph client over to doing
> sparse reads, which is necessary in order to be able to handle sparse
> writes without filling in unwritten holes.

To clarify, I'm asking about the correctness of the ciphertext written to
"disk", not about the user-visible filesystem behavior which is something
different (but also super important as well, of course).  xfstests includes both
types of tests.

Grepping for _verify_ciphertext_for_encryption_policy in xfstests will show the
tests that verify the ciphertext written to disk.  I doubt that you're running
those, as they rely on a block device.  So you'll need to write some equivalent
tests.  In a pinch, you could simply check that the ciphertext is random rather
than correct (that would at least show that it's not plaintext) like what
generic/399 does.  But actually verifying its correctness would be ideal to
ensure that nothing went wrong along the way.

- Eric
Jeff Layton Jan. 28, 2022, 8:47 p.m. UTC | #5
On Fri, 2022-01-28 at 12:39 -0800, Eric Biggers wrote:
> On Thu, Jan 27, 2022 at 06:08:40AM -0500, Jeff Layton wrote:
> > On Wed, 2022-01-26 at 18:14 -0800, Eric Biggers wrote:
> > > On Tue, Jan 11, 2022 at 02:15:20PM -0500, Jeff Layton wrote:
> > > > Still, I was able to run xfstests on this set yesterday. Bug #2 above
> > > > prevented all of the tests from passing, but it didn't oops! I call that
> > > > progress! Given that, I figured this is a good time to post what I have
> > > > so far.
> > > 
> > > One question: what sort of testing are you doing to show that the file contents
> > > and filenames being stored (i.e., sent by the client to the server in this case)
> > > have been encrypted correctly?  xfstests has tests that verify this for block
> > > device based filesystems; are you doing any equivalent testing?
> > > 
> > 
> > I've been testing this pretty regularly with xfstests, and the filenames
> > portion all seems to be working correctly. Parts of the content
> > encryption also seem to work ok. I'm still working that piece, so I
> > haven't been able to validate that part yet.
> > 
> > At the moment I'm working on switching the ceph client over to doing
> > sparse reads, which is necessary in order to be able to handle sparse
> > writes without filling in unwritten holes.
> 
> To clarify, I'm asking about the correctness of the ciphertext written to
> "disk", not about the user-visible filesystem behavior which is something
> different (but also super important as well, of course).  xfstests includes both
> types of tests.
> 
> Grepping for _verify_ciphertext_for_encryption_policy in xfstests will show the
> tests that verify the ciphertext written to disk.  I doubt that you're running
> those, as they rely on a block device.  So you'll need to write some equivalent
> tests.  In a pinch, you could simply check that the ciphertext is random rather
> than correct (that would at least show that it's not plaintext) like what
> generic/399 does.  But actually verifying its correctness would be ideal to
> ensure that nothing went wrong along the way.
> 

Got it. Yes, that would be a good thing. I'll have to see what I can do
once I get to the point of a fully-functioning prototype.
Xiubo Li Feb. 14, 2022, 9:37 a.m. UTC | #6
Hi Jeff,

I am using the 'wip-fscrypt' branch to test other issue and hit:

cp: cannot access './dir___683': No buffer space available
cp: cannot access './dir___686': No buffer space available
cp: cannot access './dir___687': No buffer space available
cp: cannot access './dir___688': No buffer space available
cp: cannot access './dir___689': No buffer space available
cp: cannot access './dir___693': No buffer space available

...

[root@lxbceph1 kcephfs]# diff ./dir___997 /data/backup/kernel/dir___997
diff: ./dir___997: No buffer space available


The dmesg logs:

<7>[ 1256.918228] ceph:  do_getattr inode 0000000089964a71 mask AsXsFs 
mode 040755
<7>[ 1256.918232] ceph:  __ceph_caps_issued_mask ino 0x100000009be cap 
0000000014f1c64b issued pAsLsXsFs (mask AsXsFs)
<7>[ 1256.918237] ceph:  __touch_cap 0000000089964a71 cap 
0000000014f1c64b mds0
<7>[ 1256.918250] ceph:  readdir 0000000089964a71 file 00000000065cb689 
pos 0
<7>[ 1256.918254] ceph:  readdir off 0 -> '.'
<7>[ 1256.918258] ceph:  readdir off 1 -> '..'
<4>[ 1256.918262] fscrypt (ceph, inode 1099511630270): Error -105 
getting encryption context
<7>[ 1256.918269] ceph:  readdir 0000000089964a71 file 00000000065cb689 
pos 2
<4>[ 1256.918273] fscrypt (ceph, inode 1099511630270): Error -105 
getting encryption context
<7>[ 1256.918288] ceph:  release inode 0000000089964a71 dir file 
00000000065cb689
<7>[ 1256.918310] ceph:  __ceph_caps_issued_mask ino 0x1 cap 
00000000aa2afb8b issued pAsLsXsFs (mask Fs)
<7>[ 1257.574593] ceph:  mdsc delayed_work

I did nothing about the fscrypt after mounting the kclient, just create 
2000 directories and then made some snapshots on the root dir and then 
try to copy the root directory to the backup.

- Xiubo

On 1/12/22 3:15 AM, Jeff Layton wrote:
> This patchset represents a (mostly) complete rough draft of fscrypt
> support for cephfs. The context, filename and symlink support is more or
> less the same as the versions posted before, and comprise the first half
> of the patches.
>
> The new bits here are the size handling changes and support for content
> encryption, in buffered, direct and synchronous codepaths. Much of this
> code is still very rough and needs a lot of cleanup work.
>
> fscrypt support relies on some MDS changes that are being tracked here:
>
>      https://github.com/ceph/ceph/pull/43588
>
> In particular, this PR adds some new opaque fields in the inode that we
> use to store fscrypt-specific information, like the context and the real
> size of a file. That is slated to be merged for the upcoming Quincy
> release (which is sometime this northern spring).
>
> There are still some notable bugs:
>
> 1/ we've identified a few more potential races in truncate handling
> which will probably necessitate a protocol change, as well as changes to
> the MDS and kclient patchsets. The good news is that we think we have
> an approach that will resolve this.
>
> 2/ the kclient doesn't handle reading sparse regions in OSD objects
> properly yet. The client can end up writing to a non-zero offset in a
> non-existent object. Then, if the client tries to read the written
> region back later, it'll get back zeroes and give you garbage when you
> try to decrypt them.
>
> It turns out that the OSD already supports a SPARSE_READ operation, so
> I'm working on implementing that in the kclient to make it not try to
> decrypt the sparse regions.
>
> Still, I was able to run xfstests on this set yesterday. Bug #2 above
> prevented all of the tests from passing, but it didn't oops! I call that
> progress! Given that, I figured this is a good time to post what I have
> so far.
>
> Note that the buffered I/O changes in this set are not suitable for
> merge and will likely end up being discarded. We need to plumb the
> encryption in at the netfs layer, so that we can store encrypted data
> in fscache.
>
> The non-buffered codepaths will likely also need substantial changes
> before merging. It may be simpler to just move that into the netfs layer
> too as cifs will need something similar anyway.
>
> My goal is to get most of this into v5.18, but v5.19 might be more
> realistiv. Hopefully I'll have a non-RFC patchset to send in a few
> weeks.
>
> Special thanks to Xiubo who came through with the MDS patches. Also,
> thanks to everyone (especially Eric Biggers) for all of the previous
> reviews. It's much appreciated!
>
> Jeff Layton (43):
>    vfs: export new_inode_pseudo
>    fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode
>    fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
>    fscrypt: add fscrypt_context_for_new_inode
>    ceph: preallocate inode for ops that may create one
>    ceph: crypto context handling for ceph
>    ceph: parse new fscrypt_auth and fscrypt_file fields in inode traces
>    ceph: add fscrypt_* handling to caps.c
>    ceph: add ability to set fscrypt_auth via setattr
>    ceph: implement -o test_dummy_encryption mount option
>    ceph: decode alternate_name in lease info
>    ceph: add fscrypt ioctls
>    ceph: make ceph_msdc_build_path use ref-walk
>    ceph: add encrypted fname handling to ceph_mdsc_build_path
>    ceph: send altname in MClientRequest
>    ceph: encode encrypted name in dentry release
>    ceph: properly set DCACHE_NOKEY_NAME flag in lookup
>    ceph: make d_revalidate call fscrypt revalidator for encrypted
>      dentries
>    ceph: add helpers for converting names for userland presentation
>    ceph: add fscrypt support to ceph_fill_trace
>    ceph: add support to readdir for encrypted filenames
>    ceph: create symlinks with encrypted and base64-encoded targets
>    ceph: make ceph_get_name decrypt filenames
>    ceph: add a new ceph.fscrypt.auth vxattr
>    ceph: add some fscrypt guardrails
>    libceph: add CEPH_OSD_OP_ASSERT_VER support
>    ceph: size handling for encrypted inodes in cap updates
>    ceph: fscrypt_file field handling in MClientRequest messages
>    ceph: get file size from fscrypt_file when present in inode traces
>    ceph: handle fscrypt fields in cap messages from MDS
>    ceph: add infrastructure for file encryption and decryption
>    libceph: allow ceph_osdc_new_request to accept a multi-op read
>    ceph: disable fallocate for encrypted inodes
>    ceph: disable copy offload on encrypted inodes
>    ceph: don't use special DIO path for encrypted inodes
>    ceph: set encryption context on open
>    ceph: align data in pages in ceph_sync_write
>    ceph: add read/modify/write to ceph_sync_write
>    ceph: plumb in decryption during sync reads
>    ceph: set i_blkbits to crypto block size for encrypted inodes
>    ceph: add fscrypt decryption support to ceph_netfs_issue_op
>    ceph: add encryption support to writepage
>    ceph: fscrypt support for writepages
>
> Luis Henriques (1):
>    ceph: don't allow changing layout on encrypted files/directories
>
> Xiubo Li (4):
>    ceph: add __ceph_get_caps helper support
>    ceph: add __ceph_sync_read helper support
>    ceph: add object version support for sync read
>    ceph: add truncate size handling support for fscrypt
>
>   fs/ceph/Makefile                |   1 +
>   fs/ceph/acl.c                   |   4 +-
>   fs/ceph/addr.c                  | 128 +++++--
>   fs/ceph/caps.c                  | 211 ++++++++++--
>   fs/ceph/crypto.c                | 374 +++++++++++++++++++++
>   fs/ceph/crypto.h                | 237 +++++++++++++
>   fs/ceph/dir.c                   | 209 +++++++++---
>   fs/ceph/export.c                |  44 ++-
>   fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
>   fs/ceph/inode.c                 | 576 +++++++++++++++++++++++++++++---
>   fs/ceph/ioctl.c                 |  87 +++++
>   fs/ceph/mds_client.c            | 349 ++++++++++++++++---
>   fs/ceph/mds_client.h            |  24 +-
>   fs/ceph/super.c                 |  90 ++++-
>   fs/ceph/super.h                 |  43 ++-
>   fs/ceph/xattr.c                 |  29 ++
>   fs/crypto/fname.c               |  44 ++-
>   fs/crypto/fscrypt_private.h     |   9 +-
>   fs/crypto/hooks.c               |   6 +-
>   fs/crypto/policy.c              |  35 +-
>   fs/inode.c                      |   1 +
>   include/linux/ceph/ceph_fs.h    |  21 +-
>   include/linux/ceph/osd_client.h |   6 +-
>   include/linux/ceph/rados.h      |   4 +
>   include/linux/fscrypt.h         |  10 +
>   net/ceph/osd_client.c           |  32 +-
>   26 files changed, 2700 insertions(+), 350 deletions(-)
>   create mode 100644 fs/ceph/crypto.c
>   create mode 100644 fs/ceph/crypto.h
>
Jeff Layton Feb. 14, 2022, 11:33 a.m. UTC | #7
On Mon, 2022-02-14 at 17:37 +0800, Xiubo Li wrote:
> Hi Jeff,
> 
> I am using the 'wip-fscrypt' branch to test other issue and hit:
> 
> cp: cannot access './dir___683': No buffer space available
> cp: cannot access './dir___686': No buffer space available
> cp: cannot access './dir___687': No buffer space available
> cp: cannot access './dir___688': No buffer space available
> cp: cannot access './dir___689': No buffer space available
> cp: cannot access './dir___693': No buffer space available
> 
> ...
> 
> [root@lxbceph1 kcephfs]# diff ./dir___997 /data/backup/kernel/dir___997
> diff: ./dir___997: No buffer space available
> 
> 
> The dmesg logs:
> 
> <7>[ 1256.918228] ceph:  do_getattr inode 0000000089964a71 mask AsXsFs 
> mode 040755
> <7>[ 1256.918232] ceph:  __ceph_caps_issued_mask ino 0x100000009be cap 
> 0000000014f1c64b issued pAsLsXsFs (mask AsXsFs)
> <7>[ 1256.918237] ceph:  __touch_cap 0000000089964a71 cap 
> 0000000014f1c64b mds0
> <7>[ 1256.918250] ceph:  readdir 0000000089964a71 file 00000000065cb689 
> pos 0
> <7>[ 1256.918254] ceph:  readdir off 0 -> '.'
> <7>[ 1256.918258] ceph:  readdir off 1 -> '..'
> <4>[ 1256.918262] fscrypt (ceph, inode 1099511630270): Error -105 
> getting encryption context
> <7>[ 1256.918269] ceph:  readdir 0000000089964a71 file 00000000065cb689 
> pos 2
> <4>[ 1256.918273] fscrypt (ceph, inode 1099511630270): Error -105 
> getting encryption context
> <7>[ 1256.918288] ceph:  release inode 0000000089964a71 dir file 
> 00000000065cb689
> <7>[ 1256.918310] ceph:  __ceph_caps_issued_mask ino 0x1 cap 
> 00000000aa2afb8b issued pAsLsXsFs (mask Fs)
> <7>[ 1257.574593] ceph:  mdsc delayed_work
> 
> I did nothing about the fscrypt after mounting the kclient, just create 
> 2000 directories and then made some snapshots on the root dir and then 
> try to copy the root directory to the backup.
> 
> - Xiubo
> 

That means that ceph_crypt_get_context returned -ENODATA, which it can
do for several different reasons. We probably need to add in some
debugging there to see which one it is...

TBH, I've done absolutely no testing with snapshots, so it's quite
possible there is some interaction there that is causing problems.

-- Jeff

> On 1/12/22 3:15 AM, Jeff Layton wrote:
> > This patchset represents a (mostly) complete rough draft of fscrypt
> > support for cephfs. The context, filename and symlink support is more or
> > less the same as the versions posted before, and comprise the first half
> > of the patches.
> > 
> > The new bits here are the size handling changes and support for content
> > encryption, in buffered, direct and synchronous codepaths. Much of this
> > code is still very rough and needs a lot of cleanup work.
> > 
> > fscrypt support relies on some MDS changes that are being tracked here:
> > 
> >      https://github.com/ceph/ceph/pull/43588
> > 
> > In particular, this PR adds some new opaque fields in the inode that we
> > use to store fscrypt-specific information, like the context and the real
> > size of a file. That is slated to be merged for the upcoming Quincy
> > release (which is sometime this northern spring).
> > 
> > There are still some notable bugs:
> > 
> > 1/ we've identified a few more potential races in truncate handling
> > which will probably necessitate a protocol change, as well as changes to
> > the MDS and kclient patchsets. The good news is that we think we have
> > an approach that will resolve this.
> > 
> > 2/ the kclient doesn't handle reading sparse regions in OSD objects
> > properly yet. The client can end up writing to a non-zero offset in a
> > non-existent object. Then, if the client tries to read the written
> > region back later, it'll get back zeroes and give you garbage when you
> > try to decrypt them.
> > 
> > It turns out that the OSD already supports a SPARSE_READ operation, so
> > I'm working on implementing that in the kclient to make it not try to
> > decrypt the sparse regions.
> > 
> > Still, I was able to run xfstests on this set yesterday. Bug #2 above
> > prevented all of the tests from passing, but it didn't oops! I call that
> > progress! Given that, I figured this is a good time to post what I have
> > so far.
> > 
> > Note that the buffered I/O changes in this set are not suitable for
> > merge and will likely end up being discarded. We need to plumb the
> > encryption in at the netfs layer, so that we can store encrypted data
> > in fscache.
> > 
> > The non-buffered codepaths will likely also need substantial changes
> > before merging. It may be simpler to just move that into the netfs layer
> > too as cifs will need something similar anyway.
> > 
> > My goal is to get most of this into v5.18, but v5.19 might be more
> > realistiv. Hopefully I'll have a non-RFC patchset to send in a few
> > weeks.
> > 
> > Special thanks to Xiubo who came through with the MDS patches. Also,
> > thanks to everyone (especially Eric Biggers) for all of the previous
> > reviews. It's much appreciated!
> > 
> > Jeff Layton (43):
> >    vfs: export new_inode_pseudo
> >    fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode
> >    fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
> >    fscrypt: add fscrypt_context_for_new_inode
> >    ceph: preallocate inode for ops that may create one
> >    ceph: crypto context handling for ceph
> >    ceph: parse new fscrypt_auth and fscrypt_file fields in inode traces
> >    ceph: add fscrypt_* handling to caps.c
> >    ceph: add ability to set fscrypt_auth via setattr
> >    ceph: implement -o test_dummy_encryption mount option
> >    ceph: decode alternate_name in lease info
> >    ceph: add fscrypt ioctls
> >    ceph: make ceph_msdc_build_path use ref-walk
> >    ceph: add encrypted fname handling to ceph_mdsc_build_path
> >    ceph: send altname in MClientRequest
> >    ceph: encode encrypted name in dentry release
> >    ceph: properly set DCACHE_NOKEY_NAME flag in lookup
> >    ceph: make d_revalidate call fscrypt revalidator for encrypted
> >      dentries
> >    ceph: add helpers for converting names for userland presentation
> >    ceph: add fscrypt support to ceph_fill_trace
> >    ceph: add support to readdir for encrypted filenames
> >    ceph: create symlinks with encrypted and base64-encoded targets
> >    ceph: make ceph_get_name decrypt filenames
> >    ceph: add a new ceph.fscrypt.auth vxattr
> >    ceph: add some fscrypt guardrails
> >    libceph: add CEPH_OSD_OP_ASSERT_VER support
> >    ceph: size handling for encrypted inodes in cap updates
> >    ceph: fscrypt_file field handling in MClientRequest messages
> >    ceph: get file size from fscrypt_file when present in inode traces
> >    ceph: handle fscrypt fields in cap messages from MDS
> >    ceph: add infrastructure for file encryption and decryption
> >    libceph: allow ceph_osdc_new_request to accept a multi-op read
> >    ceph: disable fallocate for encrypted inodes
> >    ceph: disable copy offload on encrypted inodes
> >    ceph: don't use special DIO path for encrypted inodes
> >    ceph: set encryption context on open
> >    ceph: align data in pages in ceph_sync_write
> >    ceph: add read/modify/write to ceph_sync_write
> >    ceph: plumb in decryption during sync reads
> >    ceph: set i_blkbits to crypto block size for encrypted inodes
> >    ceph: add fscrypt decryption support to ceph_netfs_issue_op
> >    ceph: add encryption support to writepage
> >    ceph: fscrypt support for writepages
> > 
> > Luis Henriques (1):
> >    ceph: don't allow changing layout on encrypted files/directories
> > 
> > Xiubo Li (4):
> >    ceph: add __ceph_get_caps helper support
> >    ceph: add __ceph_sync_read helper support
> >    ceph: add object version support for sync read
> >    ceph: add truncate size handling support for fscrypt
> > 
> >   fs/ceph/Makefile                |   1 +
> >   fs/ceph/acl.c                   |   4 +-
> >   fs/ceph/addr.c                  | 128 +++++--
> >   fs/ceph/caps.c                  | 211 ++++++++++--
> >   fs/ceph/crypto.c                | 374 +++++++++++++++++++++
> >   fs/ceph/crypto.h                | 237 +++++++++++++
> >   fs/ceph/dir.c                   | 209 +++++++++---
> >   fs/ceph/export.c                |  44 ++-
> >   fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
> >   fs/ceph/inode.c                 | 576 +++++++++++++++++++++++++++++---
> >   fs/ceph/ioctl.c                 |  87 +++++
> >   fs/ceph/mds_client.c            | 349 ++++++++++++++++---
> >   fs/ceph/mds_client.h            |  24 +-
> >   fs/ceph/super.c                 |  90 ++++-
> >   fs/ceph/super.h                 |  43 ++-
> >   fs/ceph/xattr.c                 |  29 ++
> >   fs/crypto/fname.c               |  44 ++-
> >   fs/crypto/fscrypt_private.h     |   9 +-
> >   fs/crypto/hooks.c               |   6 +-
> >   fs/crypto/policy.c              |  35 +-
> >   fs/inode.c                      |   1 +
> >   include/linux/ceph/ceph_fs.h    |  21 +-
> >   include/linux/ceph/osd_client.h |   6 +-
> >   include/linux/ceph/rados.h      |   4 +
> >   include/linux/fscrypt.h         |  10 +
> >   net/ceph/osd_client.c           |  32 +-
> >   26 files changed, 2700 insertions(+), 350 deletions(-)
> >   create mode 100644 fs/ceph/crypto.c
> >   create mode 100644 fs/ceph/crypto.h
> > 
>
Xiubo Li Feb. 14, 2022, 12:08 p.m. UTC | #8
On 2/14/22 7:33 PM, Jeff Layton wrote:
> On Mon, 2022-02-14 at 17:37 +0800, Xiubo Li wrote:
>> Hi Jeff,
>>
>> I am using the 'wip-fscrypt' branch to test other issue and hit:
>>
>> cp: cannot access './dir___683': No buffer space available
>> cp: cannot access './dir___686': No buffer space available
>> cp: cannot access './dir___687': No buffer space available
>> cp: cannot access './dir___688': No buffer space available
>> cp: cannot access './dir___689': No buffer space available
>> cp: cannot access './dir___693': No buffer space available
>>
>> ...
>>
>> [root@lxbceph1 kcephfs]# diff ./dir___997 /data/backup/kernel/dir___997
>> diff: ./dir___997: No buffer space available
>>
>>
>> The dmesg logs:
>>
>> <7>[ 1256.918228] ceph:  do_getattr inode 0000000089964a71 mask AsXsFs
>> mode 040755
>> <7>[ 1256.918232] ceph:  __ceph_caps_issued_mask ino 0x100000009be cap
>> 0000000014f1c64b issued pAsLsXsFs (mask AsXsFs)
>> <7>[ 1256.918237] ceph:  __touch_cap 0000000089964a71 cap
>> 0000000014f1c64b mds0
>> <7>[ 1256.918250] ceph:  readdir 0000000089964a71 file 00000000065cb689
>> pos 0
>> <7>[ 1256.918254] ceph:  readdir off 0 -> '.'
>> <7>[ 1256.918258] ceph:  readdir off 1 -> '..'
>> <4>[ 1256.918262] fscrypt (ceph, inode 1099511630270): Error -105
>> getting encryption context
>> <7>[ 1256.918269] ceph:  readdir 0000000089964a71 file 00000000065cb689
>> pos 2
>> <4>[ 1256.918273] fscrypt (ceph, inode 1099511630270): Error -105
>> getting encryption context
>> <7>[ 1256.918288] ceph:  release inode 0000000089964a71 dir file
>> 00000000065cb689
>> <7>[ 1256.918310] ceph:  __ceph_caps_issued_mask ino 0x1 cap
>> 00000000aa2afb8b issued pAsLsXsFs (mask Fs)
>> <7>[ 1257.574593] ceph:  mdsc delayed_work
>>
>> I did nothing about the fscrypt after mounting the kclient, just create
>> 2000 directories and then made some snapshots on the root dir and then
>> try to copy the root directory to the backup.
>>
>> - Xiubo
>>
> That means that ceph_crypt_get_context returned -ENODATA, which it can

It should be -ENOBUFS.

I am not sure it relates to the snapshot stuff. I will try without the 
snapshot later.

I can debug it later, maybe in next week.

-- Xiubo

> do for several different reasons. We probably need to add in some
> debugging there to see which one it is...
>
> TBH, I've done absolutely no testing with snapshots, so it's quite
> possible there is some interaction there that is causing problems.
>
> -- Jeff
>
>> On 1/12/22 3:15 AM, Jeff Layton wrote:
>>> This patchset represents a (mostly) complete rough draft of fscrypt
>>> support for cephfs. The context, filename and symlink support is more or
>>> less the same as the versions posted before, and comprise the first half
>>> of the patches.
>>>
>>> The new bits here are the size handling changes and support for content
>>> encryption, in buffered, direct and synchronous codepaths. Much of this
>>> code is still very rough and needs a lot of cleanup work.
>>>
>>> fscrypt support relies on some MDS changes that are being tracked here:
>>>
>>>       https://github.com/ceph/ceph/pull/43588
>>>
>>> In particular, this PR adds some new opaque fields in the inode that we
>>> use to store fscrypt-specific information, like the context and the real
>>> size of a file. That is slated to be merged for the upcoming Quincy
>>> release (which is sometime this northern spring).
>>>
>>> There are still some notable bugs:
>>>
>>> 1/ we've identified a few more potential races in truncate handling
>>> which will probably necessitate a protocol change, as well as changes to
>>> the MDS and kclient patchsets. The good news is that we think we have
>>> an approach that will resolve this.
>>>
>>> 2/ the kclient doesn't handle reading sparse regions in OSD objects
>>> properly yet. The client can end up writing to a non-zero offset in a
>>> non-existent object. Then, if the client tries to read the written
>>> region back later, it'll get back zeroes and give you garbage when you
>>> try to decrypt them.
>>>
>>> It turns out that the OSD already supports a SPARSE_READ operation, so
>>> I'm working on implementing that in the kclient to make it not try to
>>> decrypt the sparse regions.
>>>
>>> Still, I was able to run xfstests on this set yesterday. Bug #2 above
>>> prevented all of the tests from passing, but it didn't oops! I call that
>>> progress! Given that, I figured this is a good time to post what I have
>>> so far.
>>>
>>> Note that the buffered I/O changes in this set are not suitable for
>>> merge and will likely end up being discarded. We need to plumb the
>>> encryption in at the netfs layer, so that we can store encrypted data
>>> in fscache.
>>>
>>> The non-buffered codepaths will likely also need substantial changes
>>> before merging. It may be simpler to just move that into the netfs layer
>>> too as cifs will need something similar anyway.
>>>
>>> My goal is to get most of this into v5.18, but v5.19 might be more
>>> realistiv. Hopefully I'll have a non-RFC patchset to send in a few
>>> weeks.
>>>
>>> Special thanks to Xiubo who came through with the MDS patches. Also,
>>> thanks to everyone (especially Eric Biggers) for all of the previous
>>> reviews. It's much appreciated!
>>>
>>> Jeff Layton (43):
>>>     vfs: export new_inode_pseudo
>>>     fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode
>>>     fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size
>>>     fscrypt: add fscrypt_context_for_new_inode
>>>     ceph: preallocate inode for ops that may create one
>>>     ceph: crypto context handling for ceph
>>>     ceph: parse new fscrypt_auth and fscrypt_file fields in inode traces
>>>     ceph: add fscrypt_* handling to caps.c
>>>     ceph: add ability to set fscrypt_auth via setattr
>>>     ceph: implement -o test_dummy_encryption mount option
>>>     ceph: decode alternate_name in lease info
>>>     ceph: add fscrypt ioctls
>>>     ceph: make ceph_msdc_build_path use ref-walk
>>>     ceph: add encrypted fname handling to ceph_mdsc_build_path
>>>     ceph: send altname in MClientRequest
>>>     ceph: encode encrypted name in dentry release
>>>     ceph: properly set DCACHE_NOKEY_NAME flag in lookup
>>>     ceph: make d_revalidate call fscrypt revalidator for encrypted
>>>       dentries
>>>     ceph: add helpers for converting names for userland presentation
>>>     ceph: add fscrypt support to ceph_fill_trace
>>>     ceph: add support to readdir for encrypted filenames
>>>     ceph: create symlinks with encrypted and base64-encoded targets
>>>     ceph: make ceph_get_name decrypt filenames
>>>     ceph: add a new ceph.fscrypt.auth vxattr
>>>     ceph: add some fscrypt guardrails
>>>     libceph: add CEPH_OSD_OP_ASSERT_VER support
>>>     ceph: size handling for encrypted inodes in cap updates
>>>     ceph: fscrypt_file field handling in MClientRequest messages
>>>     ceph: get file size from fscrypt_file when present in inode traces
>>>     ceph: handle fscrypt fields in cap messages from MDS
>>>     ceph: add infrastructure for file encryption and decryption
>>>     libceph: allow ceph_osdc_new_request to accept a multi-op read
>>>     ceph: disable fallocate for encrypted inodes
>>>     ceph: disable copy offload on encrypted inodes
>>>     ceph: don't use special DIO path for encrypted inodes
>>>     ceph: set encryption context on open
>>>     ceph: align data in pages in ceph_sync_write
>>>     ceph: add read/modify/write to ceph_sync_write
>>>     ceph: plumb in decryption during sync reads
>>>     ceph: set i_blkbits to crypto block size for encrypted inodes
>>>     ceph: add fscrypt decryption support to ceph_netfs_issue_op
>>>     ceph: add encryption support to writepage
>>>     ceph: fscrypt support for writepages
>>>
>>> Luis Henriques (1):
>>>     ceph: don't allow changing layout on encrypted files/directories
>>>
>>> Xiubo Li (4):
>>>     ceph: add __ceph_get_caps helper support
>>>     ceph: add __ceph_sync_read helper support
>>>     ceph: add object version support for sync read
>>>     ceph: add truncate size handling support for fscrypt
>>>
>>>    fs/ceph/Makefile                |   1 +
>>>    fs/ceph/acl.c                   |   4 +-
>>>    fs/ceph/addr.c                  | 128 +++++--
>>>    fs/ceph/caps.c                  | 211 ++++++++++--
>>>    fs/ceph/crypto.c                | 374 +++++++++++++++++++++
>>>    fs/ceph/crypto.h                | 237 +++++++++++++
>>>    fs/ceph/dir.c                   | 209 +++++++++---
>>>    fs/ceph/export.c                |  44 ++-
>>>    fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
>>>    fs/ceph/inode.c                 | 576 +++++++++++++++++++++++++++++---
>>>    fs/ceph/ioctl.c                 |  87 +++++
>>>    fs/ceph/mds_client.c            | 349 ++++++++++++++++---
>>>    fs/ceph/mds_client.h            |  24 +-
>>>    fs/ceph/super.c                 |  90 ++++-
>>>    fs/ceph/super.h                 |  43 ++-
>>>    fs/ceph/xattr.c                 |  29 ++
>>>    fs/crypto/fname.c               |  44 ++-
>>>    fs/crypto/fscrypt_private.h     |   9 +-
>>>    fs/crypto/hooks.c               |   6 +-
>>>    fs/crypto/policy.c              |  35 +-
>>>    fs/inode.c                      |   1 +
>>>    include/linux/ceph/ceph_fs.h    |  21 +-
>>>    include/linux/ceph/osd_client.h |   6 +-
>>>    include/linux/ceph/rados.h      |   4 +
>>>    include/linux/fscrypt.h         |  10 +
>>>    net/ceph/osd_client.c           |  32 +-
>>>    26 files changed, 2700 insertions(+), 350 deletions(-)
>>>    create mode 100644 fs/ceph/crypto.c
>>>    create mode 100644 fs/ceph/crypto.h
>>>
Luis Henriques Feb. 14, 2022, 5:57 p.m. UTC | #9
Jeff Layton <jlayton@kernel.org> writes:

> This patchset represents a (mostly) complete rough draft of fscrypt
> support for cephfs. The context, filename and symlink support is more or
> less the same as the versions posted before, and comprise the first half
> of the patches.
>
> The new bits here are the size handling changes and support for content
> encryption, in buffered, direct and synchronous codepaths. Much of this
> code is still very rough and needs a lot of cleanup work.
>
> fscrypt support relies on some MDS changes that are being tracked here:
>
>     https://github.com/ceph/ceph/pull/43588
>

Please correct me if I'm wrong (and I've a feeling that I *will* be
wrong): we're still missing some mechanism that prevents clients that do
not support fscrypt from creating new files in an encryption directory,
right?  I'm pretty sure I've discussed this "somewhere" with "someone",
but I can't remember anything else.

At this point, I can create an encrypted directory and, from a different
client (that doesn't support fscrypt), create a new non-encrypted file in
that directory.  The result isn't good, of course.

I guess that a new feature bit can be used so that the MDS won't allow any
sort of operations (or, at least, write/create operations) on encrypted
dirs from clients that don't have this bit set.

So, am I missing something or is this still on the TODO list?

(I can try to have a look at it if this is still missing.)

Cheers,
Jeff Layton Feb. 14, 2022, 6:39 p.m. UTC | #10
On Mon, 2022-02-14 at 17:57 +0000, Luís Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
> 
> > This patchset represents a (mostly) complete rough draft of fscrypt
> > support for cephfs. The context, filename and symlink support is more or
> > less the same as the versions posted before, and comprise the first half
> > of the patches.
> > 
> > The new bits here are the size handling changes and support for content
> > encryption, in buffered, direct and synchronous codepaths. Much of this
> > code is still very rough and needs a lot of cleanup work.
> > 
> > fscrypt support relies on some MDS changes that are being tracked here:
> > 
> >     https://github.com/ceph/ceph/pull/43588
> > 
> 
> Please correct me if I'm wrong (and I've a feeling that I *will* be
> wrong): we're still missing some mechanism that prevents clients that do
> not support fscrypt from creating new files in an encryption directory,
> right?  I'm pretty sure I've discussed this "somewhere" with "someone",
> but I can't remember anything else.
> 
> At this point, I can create an encrypted directory and, from a different
> client (that doesn't support fscrypt), create a new non-encrypted file in
> that directory.  The result isn't good, of course.
> 
> I guess that a new feature bit can be used so that the MDS won't allow any
> sort of operations (or, at least, write/create operations) on encrypted
> dirs from clients that don't have this bit set.
> 
> So, am I missing something or is this still on the TODO list?
> 
> (I can try to have a look at it if this is still missing.)
> 
> Cheers,

It's still on the TODO list.

Basically, I think we'll want to allow non-fscrypt-enabled clients to
stat and readdir in an fscrypt-enabled directory tree, and unlink files
and directories in it.

They should have no need to do anything else. You can't run backups from
such clients since you wouldn't have the real size or crypto context.
Luis Henriques Feb. 14, 2022, 9 p.m. UTC | #11
Jeff Layton <jlayton@kernel.org> writes:

> On Mon, 2022-02-14 at 17:57 +0000, Luís Henriques wrote:
>> Jeff Layton <jlayton@kernel.org> writes:
>> 
>> > This patchset represents a (mostly) complete rough draft of fscrypt
>> > support for cephfs. The context, filename and symlink support is more or
>> > less the same as the versions posted before, and comprise the first half
>> > of the patches.
>> > 
>> > The new bits here are the size handling changes and support for content
>> > encryption, in buffered, direct and synchronous codepaths. Much of this
>> > code is still very rough and needs a lot of cleanup work.
>> > 
>> > fscrypt support relies on some MDS changes that are being tracked here:
>> > 
>> >     https://github.com/ceph/ceph/pull/43588
>> > 
>> 
>> Please correct me if I'm wrong (and I've a feeling that I *will* be
>> wrong): we're still missing some mechanism that prevents clients that do
>> not support fscrypt from creating new files in an encryption directory,
>> right?  I'm pretty sure I've discussed this "somewhere" with "someone",
>> but I can't remember anything else.
>> 
>> At this point, I can create an encrypted directory and, from a different
>> client (that doesn't support fscrypt), create a new non-encrypted file in
>> that directory.  The result isn't good, of course.
>> 
>> I guess that a new feature bit can be used so that the MDS won't allow any
>> sort of operations (or, at least, write/create operations) on encrypted
>> dirs from clients that don't have this bit set.
>> 
>> So, am I missing something or is this still on the TODO list?
>> 
>> (I can try to have a look at it if this is still missing.)
>> 
>> Cheers,
>
> It's still on the TODO list.
>
> Basically, I think we'll want to allow non-fscrypt-enabled clients to
> stat and readdir in an fscrypt-enabled directory tree, and unlink files
> and directories in it.
>
> They should have no need to do anything else. You can't run backups from
> such clients since you wouldn't have the real size or crypto context.

Yep, that makes sense.  And do you think that adding a new feature bit is
the best way to sort this out, or did you had other solution in mind?

I'll try to spend some time on this tomorrow.

Cheers,
Jeff Layton Feb. 14, 2022, 9:10 p.m. UTC | #12
On Mon, 2022-02-14 at 21:00 +0000, Luís Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
> 
> > On Mon, 2022-02-14 at 17:57 +0000, Luís Henriques wrote:
> > > Jeff Layton <jlayton@kernel.org> writes:
> > > 
> > > > This patchset represents a (mostly) complete rough draft of fscrypt
> > > > support for cephfs. The context, filename and symlink support is more or
> > > > less the same as the versions posted before, and comprise the first half
> > > > of the patches.
> > > > 
> > > > The new bits here are the size handling changes and support for content
> > > > encryption, in buffered, direct and synchronous codepaths. Much of this
> > > > code is still very rough and needs a lot of cleanup work.
> > > > 
> > > > fscrypt support relies on some MDS changes that are being tracked here:
> > > > 
> > > >     https://github.com/ceph/ceph/pull/43588
> > > > 
> > > 
> > > Please correct me if I'm wrong (and I've a feeling that I *will* be
> > > wrong): we're still missing some mechanism that prevents clients that do
> > > not support fscrypt from creating new files in an encryption directory,
> > > right?  I'm pretty sure I've discussed this "somewhere" with "someone",
> > > but I can't remember anything else.
> > > 
> > > At this point, I can create an encrypted directory and, from a different
> > > client (that doesn't support fscrypt), create a new non-encrypted file in
> > > that directory.  The result isn't good, of course.
> > > 
> > > I guess that a new feature bit can be used so that the MDS won't allow any
> > > sort of operations (or, at least, write/create operations) on encrypted
> > > dirs from clients that don't have this bit set.
> > > 
> > > So, am I missing something or is this still on the TODO list?
> > > 
> > > (I can try to have a look at it if this is still missing.)
> > > 
> > > Cheers,
> > 
> > It's still on the TODO list.
> > 
> > Basically, I think we'll want to allow non-fscrypt-enabled clients to
> > stat and readdir in an fscrypt-enabled directory tree, and unlink files
> > and directories in it.
> > 
> > They should have no need to do anything else. You can't run backups from
> > such clients since you wouldn't have the real size or crypto context.
> 
> Yep, that makes sense.  And do you think that adding a new feature bit is
> the best way to sort this out, or did you had other solution in mind?
> 
> I'll try to spend some time on this tomorrow.
> 

Probably a new cephfs feature bit is fine, and indeed we already have
one for fscrypt anyway. This can probably share a value with the
ALTERNATE_NAME bit.
Xiubo Li Feb. 15, 2022, 12:44 a.m. UTC | #13
On 2/14/22 8:08 PM, Xiubo Li wrote:
>
> On 2/14/22 7:33 PM, Jeff Layton wrote:
>> On Mon, 2022-02-14 at 17:37 +0800, Xiubo Li wrote:
>>> Hi Jeff,
>>>
>>> I am using the 'wip-fscrypt' branch to test other issue and hit:
>>>
>>> cp: cannot access './dir___683': No buffer space available
>>> cp: cannot access './dir___686': No buffer space available
>>> cp: cannot access './dir___687': No buffer space available
>>> cp: cannot access './dir___688': No buffer space available
>>> cp: cannot access './dir___689': No buffer space available
>>> cp: cannot access './dir___693': No buffer space available
>>>
>>> ...
>>>
>>> [root@lxbceph1 kcephfs]# diff ./dir___997 /data/backup/kernel/dir___997
>>> diff: ./dir___997: No buffer space available
>>>
>>>
>>> The dmesg logs:
>>>
>>> <7>[ 1256.918228] ceph:  do_getattr inode 0000000089964a71 mask AsXsFs
>>> mode 040755
>>> <7>[ 1256.918232] ceph:  __ceph_caps_issued_mask ino 0x100000009be cap
>>> 0000000014f1c64b issued pAsLsXsFs (mask AsXsFs)
>>> <7>[ 1256.918237] ceph:  __touch_cap 0000000089964a71 cap
>>> 0000000014f1c64b mds0
>>> <7>[ 1256.918250] ceph:  readdir 0000000089964a71 file 00000000065cb689
>>> pos 0
>>> <7>[ 1256.918254] ceph:  readdir off 0 -> '.'
>>> <7>[ 1256.918258] ceph:  readdir off 1 -> '..'
>>> <4>[ 1256.918262] fscrypt (ceph, inode 1099511630270): Error -105
>>> getting encryption context
>>> <7>[ 1256.918269] ceph:  readdir 0000000089964a71 file 00000000065cb689
>>> pos 2
>>> <4>[ 1256.918273] fscrypt (ceph, inode 1099511630270): Error -105
>>> getting encryption context
>>> <7>[ 1256.918288] ceph:  release inode 0000000089964a71 dir file
>>> 00000000065cb689
>>> <7>[ 1256.918310] ceph:  __ceph_caps_issued_mask ino 0x1 cap
>>> 00000000aa2afb8b issued pAsLsXsFs (mask Fs)
>>> <7>[ 1257.574593] ceph:  mdsc delayed_work
>>>
>>> I did nothing about the fscrypt after mounting the kclient, just create
>>> 2000 directories and then made some snapshots on the root dir and then
>>> try to copy the root directory to the backup.
>>>
>>> - Xiubo
>>>
>> That means that ceph_crypt_get_context returned -ENODATA, which it can
>
> It should be -ENOBUFS.
>
> I am not sure it relates to the snapshot stuff. I will try without the 
> snapshot later.

Without snapshot, I also can see this error.

-- Xiubo

>
> I can debug it later, maybe in next week.
>
> -- Xiubo
>
>> do for several different reasons. We probably need to add in some
>> debugging there to see which one it is...
>>
>> TBH, I've done absolutely no testing with snapshots, so it's quite
>> possible there is some interaction there that is causing problems.
>>
>> -- Jeff
>>
>>> On 1/12/22 3:15 AM, Jeff Layton wrote:
>>>> This patchset represents a (mostly) complete rough draft of fscrypt
>>>> support for cephfs. The context, filename and symlink support is 
>>>> more or
>>>> less the same as the versions posted before, and comprise the first 
>>>> half
>>>> of the patches.
>>>>
>>>> The new bits here are the size handling changes and support for 
>>>> content
>>>> encryption, in buffered, direct and synchronous codepaths. Much of 
>>>> this
>>>> code is still very rough and needs a lot of cleanup work.
>>>>
>>>> fscrypt support relies on some MDS changes that are being tracked 
>>>> here:
>>>>
>>>>       https://github.com/ceph/ceph/pull/43588
>>>>
>>>> In particular, this PR adds some new opaque fields in the inode 
>>>> that we
>>>> use to store fscrypt-specific information, like the context and the 
>>>> real
>>>> size of a file. That is slated to be merged for the upcoming Quincy
>>>> release (which is sometime this northern spring).
>>>>
>>>> There are still some notable bugs:
>>>>
>>>> 1/ we've identified a few more potential races in truncate handling
>>>> which will probably necessitate a protocol change, as well as 
>>>> changes to
>>>> the MDS and kclient patchsets. The good news is that we think we have
>>>> an approach that will resolve this.
>>>>
>>>> 2/ the kclient doesn't handle reading sparse regions in OSD objects
>>>> properly yet. The client can end up writing to a non-zero offset in a
>>>> non-existent object. Then, if the client tries to read the written
>>>> region back later, it'll get back zeroes and give you garbage when you
>>>> try to decrypt them.
>>>>
>>>> It turns out that the OSD already supports a SPARSE_READ operation, so
>>>> I'm working on implementing that in the kclient to make it not try to
>>>> decrypt the sparse regions.
>>>>
>>>> Still, I was able to run xfstests on this set yesterday. Bug #2 above
>>>> prevented all of the tests from passing, but it didn't oops! I call 
>>>> that
>>>> progress! Given that, I figured this is a good time to post what I 
>>>> have
>>>> so far.
>>>>
>>>> Note that the buffered I/O changes in this set are not suitable for
>>>> merge and will likely end up being discarded. We need to plumb the
>>>> encryption in at the netfs layer, so that we can store encrypted data
>>>> in fscache.
>>>>
>>>> The non-buffered codepaths will likely also need substantial changes
>>>> before merging. It may be simpler to just move that into the netfs 
>>>> layer
>>>> too as cifs will need something similar anyway.
>>>>
>>>> My goal is to get most of this into v5.18, but v5.19 might be more
>>>> realistiv. Hopefully I'll have a non-RFC patchset to send in a few
>>>> weeks.
>>>>
>>>> Special thanks to Xiubo who came through with the MDS patches. Also,
>>>> thanks to everyone (especially Eric Biggers) for all of the previous
>>>> reviews. It's much appreciated!
>>>>
>>>> Jeff Layton (43):
>>>>     vfs: export new_inode_pseudo
>>>>     fscrypt: export fscrypt_base64url_encode and 
>>>> fscrypt_base64url_decode
>>>>     fscrypt: export fscrypt_fname_encrypt and 
>>>> fscrypt_fname_encrypted_size
>>>>     fscrypt: add fscrypt_context_for_new_inode
>>>>     ceph: preallocate inode for ops that may create one
>>>>     ceph: crypto context handling for ceph
>>>>     ceph: parse new fscrypt_auth and fscrypt_file fields in inode 
>>>> traces
>>>>     ceph: add fscrypt_* handling to caps.c
>>>>     ceph: add ability to set fscrypt_auth via setattr
>>>>     ceph: implement -o test_dummy_encryption mount option
>>>>     ceph: decode alternate_name in lease info
>>>>     ceph: add fscrypt ioctls
>>>>     ceph: make ceph_msdc_build_path use ref-walk
>>>>     ceph: add encrypted fname handling to ceph_mdsc_build_path
>>>>     ceph: send altname in MClientRequest
>>>>     ceph: encode encrypted name in dentry release
>>>>     ceph: properly set DCACHE_NOKEY_NAME flag in lookup
>>>>     ceph: make d_revalidate call fscrypt revalidator for encrypted
>>>>       dentries
>>>>     ceph: add helpers for converting names for userland presentation
>>>>     ceph: add fscrypt support to ceph_fill_trace
>>>>     ceph: add support to readdir for encrypted filenames
>>>>     ceph: create symlinks with encrypted and base64-encoded targets
>>>>     ceph: make ceph_get_name decrypt filenames
>>>>     ceph: add a new ceph.fscrypt.auth vxattr
>>>>     ceph: add some fscrypt guardrails
>>>>     libceph: add CEPH_OSD_OP_ASSERT_VER support
>>>>     ceph: size handling for encrypted inodes in cap updates
>>>>     ceph: fscrypt_file field handling in MClientRequest messages
>>>>     ceph: get file size from fscrypt_file when present in inode traces
>>>>     ceph: handle fscrypt fields in cap messages from MDS
>>>>     ceph: add infrastructure for file encryption and decryption
>>>>     libceph: allow ceph_osdc_new_request to accept a multi-op read
>>>>     ceph: disable fallocate for encrypted inodes
>>>>     ceph: disable copy offload on encrypted inodes
>>>>     ceph: don't use special DIO path for encrypted inodes
>>>>     ceph: set encryption context on open
>>>>     ceph: align data in pages in ceph_sync_write
>>>>     ceph: add read/modify/write to ceph_sync_write
>>>>     ceph: plumb in decryption during sync reads
>>>>     ceph: set i_blkbits to crypto block size for encrypted inodes
>>>>     ceph: add fscrypt decryption support to ceph_netfs_issue_op
>>>>     ceph: add encryption support to writepage
>>>>     ceph: fscrypt support for writepages
>>>>
>>>> Luis Henriques (1):
>>>>     ceph: don't allow changing layout on encrypted files/directories
>>>>
>>>> Xiubo Li (4):
>>>>     ceph: add __ceph_get_caps helper support
>>>>     ceph: add __ceph_sync_read helper support
>>>>     ceph: add object version support for sync read
>>>>     ceph: add truncate size handling support for fscrypt
>>>>
>>>>    fs/ceph/Makefile                |   1 +
>>>>    fs/ceph/acl.c                   |   4 +-
>>>>    fs/ceph/addr.c                  | 128 +++++--
>>>>    fs/ceph/caps.c                  | 211 ++++++++++--
>>>>    fs/ceph/crypto.c                | 374 +++++++++++++++++++++
>>>>    fs/ceph/crypto.h                | 237 +++++++++++++
>>>>    fs/ceph/dir.c                   | 209 +++++++++---
>>>>    fs/ceph/export.c                |  44 ++-
>>>>    fs/ceph/file.c                  | 476 +++++++++++++++++++++-----
>>>>    fs/ceph/inode.c                 | 576 
>>>> +++++++++++++++++++++++++++++---
>>>>    fs/ceph/ioctl.c                 |  87 +++++
>>>>    fs/ceph/mds_client.c            | 349 ++++++++++++++++---
>>>>    fs/ceph/mds_client.h            |  24 +-
>>>>    fs/ceph/super.c                 |  90 ++++-
>>>>    fs/ceph/super.h                 |  43 ++-
>>>>    fs/ceph/xattr.c                 |  29 ++
>>>>    fs/crypto/fname.c               |  44 ++-
>>>>    fs/crypto/fscrypt_private.h     |   9 +-
>>>>    fs/crypto/hooks.c               |   6 +-
>>>>    fs/crypto/policy.c              |  35 +-
>>>>    fs/inode.c                      |   1 +
>>>>    include/linux/ceph/ceph_fs.h    |  21 +-
>>>>    include/linux/ceph/osd_client.h |   6 +-
>>>>    include/linux/ceph/rados.h      |   4 +
>>>>    include/linux/fscrypt.h         |  10 +
>>>>    net/ceph/osd_client.c           |  32 +-
>>>>    26 files changed, 2700 insertions(+), 350 deletions(-)
>>>>    create mode 100644 fs/ceph/crypto.c
>>>>    create mode 100644 fs/ceph/crypto.h
>>>>
Luis Henriques Feb. 16, 2022, 4:13 p.m. UTC | #14
Jeff Layton <jlayton@kernel.org> writes:

> On Mon, 2022-02-14 at 17:57 +0000, Luís Henriques wrote:
>> Jeff Layton <jlayton@kernel.org> writes:
>> 
>> > This patchset represents a (mostly) complete rough draft of fscrypt
>> > support for cephfs. The context, filename and symlink support is more or
>> > less the same as the versions posted before, and comprise the first half
>> > of the patches.
>> > 
>> > The new bits here are the size handling changes and support for content
>> > encryption, in buffered, direct and synchronous codepaths. Much of this
>> > code is still very rough and needs a lot of cleanup work.
>> > 
>> > fscrypt support relies on some MDS changes that are being tracked here:
>> > 
>> >     https://github.com/ceph/ceph/pull/43588
>> > 
>> 
>> Please correct me if I'm wrong (and I've a feeling that I *will* be
>> wrong): we're still missing some mechanism that prevents clients that do
>> not support fscrypt from creating new files in an encryption directory,
>> right?  I'm pretty sure I've discussed this "somewhere" with "someone",
>> but I can't remember anything else.
>> 
>> At this point, I can create an encrypted directory and, from a different
>> client (that doesn't support fscrypt), create a new non-encrypted file in
>> that directory.  The result isn't good, of course.
>> 
>> I guess that a new feature bit can be used so that the MDS won't allow any
>> sort of operations (or, at least, write/create operations) on encrypted
>> dirs from clients that don't have this bit set.
>> 
>> So, am I missing something or is this still on the TODO list?
>> 
>> (I can try to have a look at it if this is still missing.)
>> 
>> Cheers,
>
> It's still on the TODO list.
>
> Basically, I think we'll want to allow non-fscrypt-enabled clients to
> stat and readdir in an fscrypt-enabled directory tree, and unlink files
> and directories in it.
>
> They should have no need to do anything else. You can't run backups from
> such clients since you wouldn't have the real size or crypto context.
> -- 
> Jeff Layton <jlayton@kernel.org>

OK, I've looked at the code and I've a patch that works (sort of).  Here's
what I've done:

I'm blocking all the dangerous Ops (CEPH_MDS_OP_{CREATE,MKDIR,...}) early
in the client requests handling code.  I.e., returning -EROFS if the
client session doesn't have the feature *and* the inode has fscrypt_auth
set.

It sort of works (I still need to find if I need any locks, that's black
magic for me!), but it won't prevent a client from doing things like
appending garbage to an encrypted file.  Doing this will obviously make
that file useless, but it's not that much different from non-encrypted
files (sure, in this case it might be possible to recover some data).  But
I'm not seeing an easy way to caps into this mix.

Cheers,