ceph: fix potential use-after-free in ceph_mdsc_build_path
diff mbox series

Message ID 20190426211238.16970-1-jlayton@kernel.org
State New
Headers show
Series
  • ceph: fix potential use-after-free in ceph_mdsc_build_path
Related show

Commit Message

Jeff Layton April 26, 2019, 9:12 p.m. UTC
temp is not defined outside of the RCU critical section here. Ensure
we grab that value before we drop the rcu_read_lock.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Al Viro April 26, 2019, 9:40 p.m. UTC | #1
On Fri, Apr 26, 2019 at 05:12:38PM -0400, Jeff Layton wrote:
> temp is not defined outside of the RCU critical section here. Ensure
> we grab that value before we drop the rcu_read_lock.

> +	base = ceph_ino(d_inode(temp));
>  	rcu_read_unlock();

Umm...  Freeing (including freeing the name) is postponed by holding
rcu_read_lock().  Children moving away + dentry going negative
is *not*.

What are you trying to return there, anyway?  Root or, in case of
stop_on_nosnap, CEPH_NOSNAP one you'd stepped into?

The latter I'd suggest to handle while under ->d_lock; the former
ought to be safe if it's fs root.  Details, please...

Another fun question is whether you can hit a disconnected subtree
from open-by-fhandle in process.  That might get uncomfortable,
since you'd get the tail of actual pathname and the length will
depend upon the timing.
Jeff Layton April 26, 2019, 11:16 p.m. UTC | #2
On Fri, 2019-04-26 at 22:40 +0100, Al Viro wrote:
> On Fri, Apr 26, 2019 at 05:12:38PM -0400, Jeff Layton wrote:
> > temp is not defined outside of the RCU critical section here. Ensure
> > we grab that value before we drop the rcu_read_lock.
> > +	base = ceph_ino(d_inode(temp));
> >  	rcu_read_unlock();
> 
> Umm...  Freeing (including freeing the name) is postponed by holding
> rcu_read_lock().  Children moving away + dentry going negative
> is *not*.
> 

Yep. This patch is just aimed at "let's prevent an oops". Whether the
resulting string makes any sense is another matter entirely.

> What are you trying to return there, anyway?  Root or, in case of
> stop_on_nosnap, CEPH_NOSNAP one you'd stepped into?
> 

Correct, AFAIU.

> The latter I'd suggest to handle while under ->d_lock; the former
> ought to be safe if it's fs root.  Details, please...


I'm still rather new to this code, but...basically stop_on_nonsnap is
always set, other than in two cases:

1/ printing pathnames (generally for debugfs)

2/ legacy protocol version handling

The first we don't really care too much about correctness in the face of
races, only that we have a consistent string of some sort. The second
might matter, but those really old protocol versions are not in wide
usage. Ceph has a lot of legacy cruft.

In any case, most of the calls that involve filenames end up getting a
parent inode number and a single dentry name. For snapshots you'll get a
non-snapped parent inode number (generally the one of the .snap
directory) and full path from there to the dentry. Snapshots are read-
only though so in principle that tree should be relatively static.

> Another fun question is whether you can hit a disconnected subtree
> from open-by-fhandle in process.  That might get uncomfortable,
> since you'd get the tail of actual pathname and the length will
> depend upon the timing.

Before it returns, ceph_mdsc_build_path does this:

if (pos != 0 || read_seqretry(&rename_lock, seq)) {
	kfree(path);
	goto retry;
}

If we end up with a path that has a different length than the one we
originally allocated, we'll do the whole thing over again.

I think the only place that's really a problem is when we're talking to
a legacy protocol server? Maybe one of the ceph maintainers can step in
and clarify this point.

Patch
diff mbox series

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 2b102bf5c356..74cb3078ea63 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2087,13 +2087,14 @@  static inline  u64 __get_oldest_tid(struct ceph_mds_client *mdsc)
  * Encode hidden .snap dirs as a double /, i.e.
  *   foo/.snap/bar -> foo//bar
  */
-char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base,
+char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase,
 			   int stop_on_nosnap)
 {
 	struct dentry *temp;
 	char *path;
 	int len, pos;
 	unsigned seq;
+	u64 base;
 
 	if (!dentry)
 		return ERR_PTR(-EINVAL);
@@ -2149,6 +2150,7 @@  char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base,
 			path[--pos] = '/';
 		temp = temp->d_parent;
 	}
+	base = ceph_ino(d_inode(temp));
 	rcu_read_unlock();
 	if (pos != 0 || read_seqretry(&rename_lock, seq)) {
 		pr_err("build_path did not end path lookup where "
@@ -2161,10 +2163,10 @@  char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base,
 		goto retry;
 	}
 
-	*base = ceph_ino(d_inode(temp));
+	*pbase = base;
 	*plen = len;
 	dout("build_path on %p %d built %llx '%.*s'\n",
-	     dentry, d_count(dentry), *base, len, path);
+	     dentry, d_count(dentry), base, len, path);
 	return path;
 }