[RFC,v3,12/16] ceph: add encrypted fname handling to ceph_mdsc_build_path

Message ID 20200914191707.380444-13-jlayton@kernel.org (mailing list archive)
State Not Applicable
Series ceph+fscrypt: context, filename and symlink support

Commit Message

Jeff Layton Sept. 14, 2020, 7:17 p.m. UTC
Allow ceph_mdsc_build_path to encrypt and base64 encode the filename
when the parent is encrypted and we're sending the path to the MDS.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 70 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 16 deletions(-)

Comments

Eric Biggers Sept. 15, 2020, 1:41 a.m. UTC | #1
On Mon, Sep 14, 2020 at 03:17:03PM -0400, Jeff Layton wrote:
> +		} else {
> +			int err;
> +			struct fscrypt_name fname = { };
> +			int len;
> +			char buf[FSCRYPT_BASE64_CHARS(NAME_MAX)];
> +
> +			dget(parent);
> +			spin_unlock(&cur->d_lock);
> +
> +			err = fscrypt_setup_filename(d_inode(parent), &cur->d_name, 1, &fname);
> +			if (err) {
> +				dput(parent);
> +				dput(cur);
> +				return ERR_PTR(err);
> +			}

It's still not clear how no-key names are handled here (or if they are even
possible here).

> +
> +			/* base64 encode the encrypted name */
> +			len = fscrypt_base64_encode(fname.disk_name.name, fname.disk_name.len, buf);
> +			pos -= len;
> +			if (pos < 0) {
> +				dput(parent);
> +				fscrypt_free_filename(&fname);
> +				break;
> +			}
> +			memcpy(path + pos, buf, len);
> +			dout("non-ciphertext name = %.*s\n", len, buf);
> +			fscrypt_free_filename(&fname);

This says "non-ciphertext name", which suggests that it's a plaintext name.  But
actually it's a base64-encoded ciphertext name.

- Eric
Jeff Layton Sept. 16, 2020, 12:30 p.m. UTC | #2
On Mon, 2020-09-14 at 18:41 -0700, Eric Biggers wrote:
> On Mon, Sep 14, 2020 at 03:17:03PM -0400, Jeff Layton wrote:
> > +		} else {
> > +			int err;
> > +			struct fscrypt_name fname = { };
> > +			int len;
> > +			char buf[FSCRYPT_BASE64_CHARS(NAME_MAX)];
> > +
> > +			dget(parent);
> > +			spin_unlock(&cur->d_lock);
> > +
> > +			err = fscrypt_setup_filename(d_inode(parent), &cur->d_name, 1, &fname);
> > +			if (err) {
> > +				dput(parent);
> > +				dput(cur);
> > +				return ERR_PTR(err);
> > +			}
> 
> It's still not clear how no-key names are handled here (or if they are even
> possible here).
> 

They're not really handled yet. We need support in the MDS for it, which
is being worked on by Xiubo (cc'ed):

    https://tracker.ceph.com/issues/47162

For now, working with names > ~149 characters can leave you with bad
dentries that the client may not be able to work with if you don't have
the key.

It sounds like we'll probably need to stabilize some version of the
nokey name so that we can allow the MDS to look them up. Would it be a
problem for us to use the current version of the nokey name format for
this, or would it be better to come up with some other distinct format
for this?

Using the current version of the nokey name is simple as we can just
pass it as-is to the MDS if someone is working in a directory w/o keys.

> > +
> > +			/* base64 encode the encrypted name */
> > +			len = fscrypt_base64_encode(fname.disk_name.name, fname.disk_name.len, buf);
> > +			pos -= len;
> > +			if (pos < 0) {
> > +				dput(parent);
> > +				fscrypt_free_filename(&fname);
> > +				break;
> > +			}
> > +			memcpy(path + pos, buf, len);
> > +			dout("non-ciphertext name = %.*s\n", len, buf);
> > +			fscrypt_free_filename(&fname);
> 
> This says "non-ciphertext name", which suggests that it's a plaintext name.  But
> actually it's a base64-encoded ciphertext name.
> 

Thanks. I fixed the comment.
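
The "~149 characters" threshold mentioned above follows from simple length arithmetic. The sketch below works through it in user-space C. It assumes the nokey name layout used by fs/crypto at the time (8 bytes of dirhash, up to 149 ciphertext bytes, and a 32-byte SHA-256 of the remainder for longer names) and unpadded base64. The macro and function names are hypothetical, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, modelled on struct fscrypt_nokey_name; illustrative only. */
#define DIRHASH_BYTES	8	/* two 32-bit dirhash words */
#define PREFIX_BYTES	149	/* ciphertext bytes carried verbatim */
#define SHA256_BYTES	32	/* digest of the ciphertext past the prefix */

/* Length of unpadded base64 output: ceil(nbytes * 4 / 3). */
static size_t base64_chars(size_t nbytes)
{
	return (nbytes * 4 + 2) / 3;
}

/*
 * Presentation length of a nokey name for a ciphertext of the given size.
 * Short names are encoded whole; long names keep a prefix and replace the
 * rest with a fixed-size digest, so the result has a hard upper bound.
 */
static size_t nokey_name_chars(size_t ciphertext_len)
{
	size_t raw;

	assert(ciphertext_len > 0);
	if (ciphertext_len <= PREFIX_BYTES)
		raw = DIRHASH_BYTES + ciphertext_len;
	else
		raw = DIRHASH_BYTES + PREFIX_BYTES + SHA256_BYTES;
	return base64_chars(raw);
}
```

Under these assumptions the longest nokey name is base64_chars(8 + 149 + 32) = 252 characters, which fits in NAME_MAX (255). A 255-byte ciphertext encoded whole would need 340 characters, which does not fit, so names past the prefix limit cannot be represented without the hashed tail.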
Eric Biggers Sept. 16, 2020, 5:36 p.m. UTC | #3
On Wed, Sep 16, 2020 at 08:30:01AM -0400, Jeff Layton wrote:
> 
> It sounds like we'll probably need to stabilize some version of the
> nokey name so that we can allow the MDS to look them up. Would it be a
> problem for us to use the current version of the nokey name format for
> this, or would it be better to come up with some other distinct format
> for this?
> 

You could use the current version, with the dirhash field changed from u32 to
__le32 so that it doesn't depend on CPU endianness.  But you should also
consider using just base64(SHA256(filename)).  The SHA256(filename) approach
wouldn't include a dirhash, and it would handle short filenames less
efficiently.  However, it would be simpler.  Would it be any easier for you?

I'm not sure which would be better from a fs/crypto/ perspective.  For *now*, it
would be easier if you just used the current 'struct fscrypt_nokey_name'.
However, anything you use would be set in stone, whereas as-is the format can be
changed at any time.  In fact, we changed it recently; see commit edc440e3d27f.

If we happen to change the nokey name in the future for local filesystems (say,
to use BLAKE2 instead of SHA256, or to support longer dirhashes), then it would
be easier if the stable format were just SHA256(filename).

It's not a huge deal though.  So if e.g. you like that the current format avoids
the cryptographic hash for the vast majority of filenames, and if you're fine
with the slightly increased complexity, you can just use it.

- Eric
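
To illustrate the endianness point above: if the dirhash words are written into a buffer as native u32 values, the resulting bytes differ between little-endian and big-endian hosts, so the encoded name would too. A minimal user-space sketch of a byte-order-independent encoding follows. The helper names are hypothetical; in-kernel code would more likely use __le32 fields with cpu_to_le32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)((v >> 8) & 0xff);
	dst[2] = (uint8_t)((v >> 16) & 0xff);
	dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read back a little-endian 32-bit value. */
static uint32_t get_le32(const uint8_t *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/*
 * Hypothetical helper: write the two dirhash words at the start of a
 * nokey-name buffer in a fixed byte order. Returns the bytes written.
 */
static size_t encode_dirhash(uint8_t *out, size_t out_len,
			     uint32_t hash, uint32_t minor_hash)
{
	assert(out != NULL && out_len >= 8);
	put_le32(out, hash);
	put_le32(out + 4, minor_hash);
	return 8;
}
```

Because the byte layout no longer depends on the CPU, a name produced by one client can be looked up by the server or by a client of a different architecture.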
Jeff Layton Sept. 16, 2020, 6:04 p.m. UTC | #4
On Wed, 2020-09-16 at 10:36 -0700, Eric Biggers wrote:
> On Wed, Sep 16, 2020 at 08:30:01AM -0400, Jeff Layton wrote:
> > It sounds like we'll probably need to stabilize some version of the
> > nokey name so that we can allow the MDS to look them up. Would it be a
> > problem for us to use the current version of the nokey name format for
> > this, or would it be better to come up with some other distinct format
> > for this?
> > 
> 
> You could use the current version, with the dirhash field changed from u32 to
> __le32 so that it doesn't depend on CPU endianness.  But you should also
> consider using just base64(SHA256(filename)).  The SHA256(filename) approach
> wouldn't include a dirhash, and it would handle short filenames less
> efficiently.  However, it would be simpler.  Would it be any easier for you?
> 
> I'm not sure which would be better from a fs/crypto/ perspective.  For *now*, it
> would be easier if you just used the current 'struct fscrypt_nokey_name'.
> However, anything you use would be set in stone, whereas as-is the format can be
> changed at any time.  In fact, we changed it recently; see commit edc440e3d27f.
> 
> If we happen to change the nokey name in the future for local filesystems (say,
> to use BLAKE2 instead of SHA256, or to support longer dirhashes), then it would
> be easier if the stable format were just SHA256(filename).
> 
> It's not a huge deal though.  So if e.g. you like that the current format avoids
> the cryptographic hash for the vast majority of filenames, and if you're fine
> with the slightly increased complexity, you can just use it.
> 

The problem with using a different scheme from the presentation format
is this:

Suppose I don't have the key for a directory and do a readdir() in
there, and get back a nokey name with the hash at the end. A little
while later, the dentry gets evicted from the cache.

Userland then comes back and wants to do something with that dentry
(maybe an unlink or stat). Now I have to look it up. At that point, I
don't really have a way to resolve that on the client [1]. I have to ask
the server to do it. What do I ask it to look up?

Storing the stable format as a full SHA256 hash of the name is
problematic as I don't think we can convert the nokey name to it
directly (can we?).

If we store the current nokey format (or some variant of it that doesn't
include the dirhash fields) then we should be able to look up the
dentry, even when we don't have complete dir contents.
Eric Biggers Sept. 16, 2020, 6:42 p.m. UTC | #5
On Wed, Sep 16, 2020 at 02:04:23PM -0400, Jeff Layton wrote:
> On Wed, 2020-09-16 at 10:36 -0700, Eric Biggers wrote:
> > On Wed, Sep 16, 2020 at 08:30:01AM -0400, Jeff Layton wrote:
> > > It sounds like we'll probably need to stabilize some version of the
> > > nokey name so that we can allow the MDS to look them up. Would it be a
> > > problem for us to use the current version of the nokey name format for
> > > this, or would it be better to come up with some other distinct format
> > > for this?
> > > 
> > 
> > You could use the current version, with the dirhash field changed from u32 to
> > __le32 so that it doesn't depend on CPU endianness.  But you should also
> > consider using just base64(SHA256(filename)).  The SHA256(filename) approach
> > wouldn't include a dirhash, and it would handle short filenames less
> > efficiently.  However, it would be simpler.  Would it be any easier for you?
> > 
> > I'm not sure which would be better from a fs/crypto/ perspective.  For *now*, it
> > would be easier if you just used the current 'struct fscrypt_nokey_name'.
> > However, anything you use would be set in stone, whereas as-is the format can be
> > changed at any time.  In fact, we changed it recently; see commit edc440e3d27f.
> > 
> > If we happen to change the nokey name in the future for local filesystems (say,
> > to use BLAKE2 instead of SHA256, or to support longer dirhashes), then it would
> > be easier if the stable format were just SHA256(filename).
> > 
> > It's not a huge deal though.  So if e.g. you like that the current format avoids
> > the cryptographic hash for the vast majority of filenames, and if you're fine
> > with the slightly increased complexity, you can just use it.
> > 
> 
> The problem with using a different scheme from the presentation format
> is this:
> 
> Suppose I don't have the key for a directory and do a readdir() in
> there, and get back a nokey name with the hash at the end. A little
> while later, the dentry gets evicted from the cache.
> 
> Userland then comes back and wants to do something with that dentry
> (maybe an unlink or stat). Now I have to look it up. At that point, I
> don't really have a way to resolve that on the client [1]. I have to ask
> the server to do it. What do I ask it to look up?
> 
> Storing the stable format as a full SHA256 hash of the name is
> problematic as I don't think we can convert the nokey name to it
> directly (can we?).
> 
> If we store the current nokey format (or some variant of it that doesn't
> include the dirhash fields) then we should be able to look up the
> dentry, even when we don't have complete dir contents.
> -- 
> Jeff Layton <jlayton@kernel.org>
> 
> [1]: ok, technically we could do a readdir in the directory and try to
> match the nokey name by deriving them from the full crypttext, but
> that's potentially _very_ expensive if the dir is large.

You'd need to use the same format for storage and presentation.

My point is that other filesystems don't have that constraint, and it could
happen that we decide to change the presentation format for those *other*
filesystems in the future.  Say, if SHA-256 falls out of favor and people want
it replaced with a different cryptographic hash algorithm; or if a filesystem
with 128-bit dirhashes adds support for fscrypt; or if it turns out that a
different variant of base64 would be better.

The ceph format would then be a "legacy" format that we'd need to support.  That
would be somewhat easier if it was simply base64(SHA-256(filename)), vs.
something more complicated.  Again, not a huge deal though, and maybe you want
to avoid doing the hash for short filenames anyway.

- Eric
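
For readers unfamiliar with ceph_mdsc_build_path, the reverse construction it uses can be sketched in user space as follows. This is a simplified illustration, not the kernel code: struct node is a hypothetical stand-in for a dentry, there is no locking, snapshot handling or rename retry, and the component is copied as-is at the point where the patch would encrypt and base64-encode it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dentry: a name and a link to the parent. */
struct node {
	const char *name;
	const struct node *parent;	/* NULL for the root */
};

/*
 * Fill buf from its end toward its start while walking leaf -> root.
 * Returns a pointer to the first character of the path inside buf,
 * or NULL if the path does not fit.
 */
static char *build_path_reverse(const struct node *leaf, char *buf,
				size_t buflen)
{
	const struct node *cur;
	size_t pos;

	assert(leaf != NULL && buf != NULL && buflen > 0);
	pos = buflen - 1;
	buf[pos] = '\0';

	for (cur = leaf; cur->parent != NULL; cur = cur->parent) {
		size_t len = strlen(cur->name);

		if (pos < len + 1)	/* need room for '/' plus the name */
			return NULL;
		pos -= len;
		/* A per-component transformation would be applied here. */
		memcpy(buf + pos, cur->name, len);
		buf[--pos] = '/';
	}
	return buf + pos;
}
```

Building from the end means the total path length does not need to be known up front. Each component is prepended as the walk moves toward the root, and the walk stops with an error as soon as the buffer is exhausted.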
Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index e3dc061252d4..7eb504170981 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2314,18 +2314,27 @@  static inline  u64 __get_oldest_tid(struct ceph_mds_client *mdsc)
 	return mdsc->oldest_tid;
 }
 
-/*
- * Build a dentry's path.  Allocate on heap; caller must kfree.  Based
- * on build_path_from_dentry in fs/cifs/dir.c.
+/**
+ * ceph_mdsc_build_path - build a path string to a given dentry
+ * @dentry: dentry to which path should be built
+ * @plen: returned length of string
+ * @pbase: returned base inode number
+ * @for_wire: is this path going to be sent to the MDS?
+ *
+ * Build a string that represents the path to the dentry. This is mostly called
+ * for two different purposes:
+ *
+ * 1) we need to build a path string to send to the MDS (for_wire == true)
+ * 2) we need a path string for local presentation (e.g. debugfs) (for_wire == false)
  *
- * If @stop_on_nosnap, generate path relative to the first non-snapped
- * inode.
+ * The path is built in reverse, starting with the dentry. Walk back up toward
+ * the root, building the path until the first non-snapped inode is reached (for_wire)
+ * or the root inode is reached (!for_wire).
  *
  * Encode hidden .snap dirs as a double /, i.e.
  *   foo/.snap/bar -> foo//bar
  */
-char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase,
-			   int stop_on_nosnap)
+char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for_wire)
 {
 	struct dentry *cur;
 	struct inode *inode;
@@ -2347,30 +2356,59 @@  char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase,
 	seq = read_seqbegin(&rename_lock);
 	cur = dget(dentry);
 	for (;;) {
-		struct dentry *temp;
+		struct dentry *parent;
 
 		spin_lock(&cur->d_lock);
 		inode = d_inode(cur);
+		parent = cur->d_parent;
 		if (inode && ceph_snap(inode) == CEPH_SNAPDIR) {
 			dout("build_path path+%d: %p SNAPDIR\n",
 			     pos, cur);
-		} else if (stop_on_nosnap && inode && dentry != cur &&
-			   ceph_snap(inode) == CEPH_NOSNAP) {
+			dget(parent);
+			spin_unlock(&cur->d_lock);
+		} else if (for_wire && inode && dentry != cur && ceph_snap(inode) == CEPH_NOSNAP) {
 			spin_unlock(&cur->d_lock);
 			pos++; /* get rid of any prepended '/' */
 			break;
-		} else {
+		} else if (!for_wire || !IS_ENCRYPTED(d_inode(parent))) {
 			pos -= cur->d_name.len;
 			if (pos < 0) {
 				spin_unlock(&cur->d_lock);
 				break;
 			}
 			memcpy(path + pos, cur->d_name.name, cur->d_name.len);
+			dget(parent);
+			spin_unlock(&cur->d_lock);
+		} else {
+			int err;
+			struct fscrypt_name fname = { };
+			int len;
+			char buf[FSCRYPT_BASE64_CHARS(NAME_MAX)];
+
+			dget(parent);
+			spin_unlock(&cur->d_lock);
+
+			err = fscrypt_setup_filename(d_inode(parent), &cur->d_name, 1, &fname);
+			if (err) {
+				dput(parent);
+				dput(cur);
+				return ERR_PTR(err);
+			}
+
+			/* base64 encode the encrypted name */
+			len = fscrypt_base64_encode(fname.disk_name.name, fname.disk_name.len, buf);
+			pos -= len;
+			if (pos < 0) {
+				dput(parent);
+				fscrypt_free_filename(&fname);
+				break;
+			}
+			memcpy(path + pos, buf, len);
+			dout("non-ciphertext name = %.*s\n", len, buf);
+			fscrypt_free_filename(&fname);
 		}
-		temp = cur;
-		cur = dget(temp->d_parent);
-		spin_unlock(&temp->d_lock);
-		dput(temp);
+		dput(cur);
+		cur = parent;
 
 		/* Are we at the root? */
 		if (IS_ROOT(cur))
@@ -2415,7 +2453,7 @@  static int build_dentry_path(struct dentry *dentry, struct inode *dir,
 	rcu_read_lock();
 	if (!dir)
 		dir = d_inode_rcu(dentry->d_parent);
-	if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP) {
+	if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP && !IS_ENCRYPTED(dir)) {
 		*pino = ceph_ino(dir);
 		rcu_read_unlock();
 		*ppath = dentry->d_name.name;