Message ID | 20190426211238.16970-1-jlayton@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ceph: fix potential use-after-free in ceph_mdsc_build_path | expand |
On Fri, Apr 26, 2019 at 05:12:38PM -0400, Jeff Layton wrote: > temp is not defined outside of the RCU critical section here. Ensure > we grab that value before we drop the rcu_read_lock. > + base = ceph_ino(d_inode(temp)); > rcu_read_unlock(); Umm... Freeing (including freeing the name) is postponed by holding rcu_read_lock(). Children moving away + dentry going negative is *not*. What are you trying to return there, anyway? Root or, in case of stop_on_nosnap, CEPH_NOSNAP one you'd stepped into? The latter I'd suggest to handle while under ->d_lock; the former ought to be safe if it's fs root. Details, please... Another fun question is whether you can hit a disconnected subtree from open-by-fhandle in process. That might get uncomfortable, since you'd get the tail of actual pathname and the length will depend upon the timing.
On Fri, 2019-04-26 at 22:40 +0100, Al Viro wrote: > On Fri, Apr 26, 2019 at 05:12:38PM -0400, Jeff Layton wrote: > > temp is not defined outside of the RCU critical section here. Ensure > > we grab that value before we drop the rcu_read_lock. > > + base = ceph_ino(d_inode(temp)); > > rcu_read_unlock(); > > Umm... Freeing (including freeing the name) is postponed by holding > rcu_read_lock(). Children moving away + dentry going negative > is *not*. > Yep. This patch is just aimed at "let's prevent an oops". Whether the resulting string makes any sense is another matter entirely. > What are you trying to return there, anyway? Root or, in case of > stop_on_nosnap, CEPH_NOSNAP one you'd stepped into? > Correct, AFAIU. > The latter I'd suggest to handle while under ->d_lock; the former > ought to be safe if it's fs root. Details, please... I'm still rather new to this code, but...basically stop_on_nonsnap is always set, other than in two cases: 1/ printing pathnames (generally for debugfs) 2/ legacy protocol version handling The first we don't really care too much about correctness in the face of races, only that we have a consistent string of some sort. The second might matter, but those really old protocol versions are not in wide usage. Ceph has a lot of legacy cruft. In any case, most of the calls that involve filenames end up getting a parent inode number and a single dentry name. For snapshots you'll get a non-snapped parent inode number (generally the one of the .snap directory) and full path from there to the dentry. Snapshots are read- only though so in principle that tree should be relatively static. > Another fun question is whether you can hit a disconnected subtree > from open-by-fhandle in process. That might get uncomfortable, > since you'd get the tail of actual pathname and the length will > depend upon the timing. Before it returns, ceph_mdsc_build_path does this: if (pos != 0 || read_seqretry(&rename_lock, seq)) { kfree(path); goto retry; } If we end up with a path that has a different length than the one we originally allocated, we'll do the whole thing over again. I think the only place that's really a problem is when we're talking to a legacy protocol server? Maybe one of the ceph maintainers can step in and clarify this point.
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 2b102bf5c356..74cb3078ea63 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2087,13 +2087,14 @@ static inline u64 __get_oldest_tid(struct ceph_mds_client *mdsc) * Encode hidden .snap dirs as a double /, i.e. * foo/.snap/bar -> foo//bar */ -char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base, +char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int stop_on_nosnap) { struct dentry *temp; char *path; int len, pos; unsigned seq; + u64 base; if (!dentry) return ERR_PTR(-EINVAL); @@ -2149,6 +2150,7 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base, path[--pos] = '/'; temp = temp->d_parent; } + base = ceph_ino(d_inode(temp)); rcu_read_unlock(); if (pos != 0 || read_seqretry(&rename_lock, seq)) { pr_err("build_path did not end path lookup where " @@ -2161,10 +2163,10 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *base, goto retry; } - *base = ceph_ino(d_inode(temp)); + *pbase = base; *plen = len; dout("build_path on %p %d built %llx '%.*s'\n", - dentry, d_count(dentry), *base, len, path); + dentry, d_count(dentry), base, len, path); return path; }
temp is not defined outside of the RCU critical section here. Ensure we grab that value before we drop the rcu_read_lock. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@kernel.org> --- fs/ceph/mds_client.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)