Message ID | 20220829045728.488148-1-xiubli@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] ceph: fail the open_by_handle_at() if the dentry is being unlinked | expand |
On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote: > From: Xiubo Li <xiubli@redhat.com> > > When unlinking a file the kclient will send a unlink request to MDS > by holding the dentry reference, and then the MDS will return 2 replies, > which are unsafe reply and a deferred safe reply. > > After the unsafe reply received the kernel will return and succeed > the unlink request to user space apps. > > Only when the safe reply received the dentry's reference will be > released. Or the dentry will only be unhashed from dcache. But when > the open_by_handle_at() begins to open the unlinked files it will > succeed. > > URL: https://tracker.ceph.com/issues/56524 > Signed-off-by: Xiubo Li <xiubli@redhat.com> > --- > > V2: > - If the dentry was released and inode is evicted such as by dropping > the caches, it will allocate a new dentry, which is also unhashed. > > > fs/ceph/export.c | 17 ++++++++++++++++- > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c > index 0ebf2bd93055..5edc1d31cd79 100644 > --- a/fs/ceph/export.c > +++ b/fs/ceph/export.c > @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) > static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > { > struct inode *inode = __lookup_inode(sb, ino); > + struct dentry *dentry; > int err; > > if (IS_ERR(inode)) > @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > iput(inode); > return ERR_PTR(-ESTALE); > } > - return d_obtain_alias(inode); > + > + /* > + * -ESTALE if the dentry exists and is unhashed, > + * which should be being released > + */ > + dentry = d_find_any_alias(inode); > + if (dentry && unlikely(d_unhashed(dentry))) { > + dput(dentry); > + return ERR_PTR(-ESTALE); > + } > + > + if (!dentry) > + dentry = d_obtain_alias(inode); > + > + return dentry; > } > > static struct dentry *__snapfh_to_dentry(struct super_block *sb, This looks racy. Suppose we have 2 racing tasks calling __fh_to_dentry for the same inode. The first one races in and doesn't find anything. d_obtain alias creates a disconnected dentry and returns it. The next task then finds it, sees that it's disconnected and gets back -ESTALE. I think you may need to detect this situation in a different way.
On 8/30/22 2:17 AM, Jeff Layton wrote: > On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote: >> From: Xiubo Li <xiubli@redhat.com> >> >> When unlinking a file the kclient will send a unlink request to MDS >> by holding the dentry reference, and then the MDS will return 2 replies, >> which are unsafe reply and a deferred safe reply. >> >> After the unsafe reply received the kernel will return and succeed >> the unlink request to user space apps. >> >> Only when the safe reply received the dentry's reference will be >> released. Or the dentry will only be unhashed from dcache. But when >> the open_by_handle_at() begins to open the unlinked files it will >> succeed. >> >> URL: https://tracker.ceph.com/issues/56524 >> Signed-off-by: Xiubo Li <xiubli@redhat.com> >> --- >> >> V2: >> - If the dentry was released and inode is evicted such as by dropping >> the caches, it will allocate a new dentry, which is also unhashed. >> >> >> fs/ceph/export.c | 17 ++++++++++++++++- >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/fs/ceph/export.c b/fs/ceph/export.c >> index 0ebf2bd93055..5edc1d31cd79 100644 >> --- a/fs/ceph/export.c >> +++ b/fs/ceph/export.c >> @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) >> static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) >> { >> struct inode *inode = __lookup_inode(sb, ino); >> + struct dentry *dentry; >> int err; >> >> if (IS_ERR(inode)) >> @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) >> iput(inode); >> return ERR_PTR(-ESTALE); >> } >> - return d_obtain_alias(inode); >> + >> + /* >> + * -ESTALE if the dentry exists and is unhashed, >> + * which should be being released >> + */ >> + dentry = d_find_any_alias(inode); >> + if (dentry && unlikely(d_unhashed(dentry))) { >> + dput(dentry); >> + return ERR_PTR(-ESTALE); >> + } >> + >> + if (!dentry) >> + dentry = d_obtain_alias(inode); >> + >> + return dentry; >> } >> >> static struct dentry *__snapfh_to_dentry(struct super_block *sb, > This looks racy. > > Suppose we have 2 racing tasks calling __fh_to_dentry for the same > inode. The first one races in and doesn't find anything. d_obtain alias > creates a disconnected dentry and returns it. The next task then finds > it, sees that it's disconnected and gets back -ESTALE. > > I think you may need to detect this situation in a different way. Yeah, you're right. Locally I have another version of patch, which will add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING". If the file have hard links and there are more than one alias and one of them is being unlinked, shouldn't we make sure we will pick a normal one here ? If so we should iterate all the alias and filter out the being unlinked ones. At the same time I found another issue for the "ceph_fh_to_dentry()". That is we never check the inode->i_generation like other filesystems, which will make sure the inode we are trying to open is the exactly the same one saved in userspace. The inode maybe deleted and created before this. Thanks Xiubo
On 8/30/22 10:30 AM, Xiubo Li wrote: > > On 8/30/22 2:17 AM, Jeff Layton wrote: >> On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote: >>> From: Xiubo Li <xiubli@redhat.com> >>> >>> When unlinking a file the kclient will send a unlink request to MDS >>> by holding the dentry reference, and then the MDS will return 2 >>> replies, >>> which are unsafe reply and a deferred safe reply. >>> >>> After the unsafe reply received the kernel will return and succeed >>> the unlink request to user space apps. >>> >>> Only when the safe reply received the dentry's reference will be >>> released. Or the dentry will only be unhashed from dcache. But when >>> the open_by_handle_at() begins to open the unlinked files it will >>> succeed. >>> >>> URL: https://tracker.ceph.com/issues/56524 >>> Signed-off-by: Xiubo Li <xiubli@redhat.com> >>> --- >>> >>> V2: >>> - If the dentry was released and inode is evicted such as by dropping >>> the caches, it will allocate a new dentry, which is also unhashed. >>> >>> >>> fs/ceph/export.c | 17 ++++++++++++++++- >>> 1 file changed, 16 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/ceph/export.c b/fs/ceph/export.c >>> index 0ebf2bd93055..5edc1d31cd79 100644 >>> --- a/fs/ceph/export.c >>> +++ b/fs/ceph/export.c >>> @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct >>> super_block *sb, u64 ino) >>> static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) >>> { >>> struct inode *inode = __lookup_inode(sb, ino); >>> + struct dentry *dentry; >>> int err; >>> if (IS_ERR(inode)) >>> @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct >>> super_block *sb, u64 ino) >>> iput(inode); >>> return ERR_PTR(-ESTALE); >>> } >>> - return d_obtain_alias(inode); >>> + >>> + /* >>> + * -ESTALE if the dentry exists and is unhashed, >>> + * which should be being released >>> + */ >>> + dentry = d_find_any_alias(inode); >>> + if (dentry && unlikely(d_unhashed(dentry))) { >>> + dput(dentry); >>> + return ERR_PTR(-ESTALE); >>> + } >>> + >>> + if (!dentry) >>> + dentry = d_obtain_alias(inode); >>> + >>> + return dentry; >>> } >>> static struct dentry *__snapfh_to_dentry(struct super_block *sb, >> This looks racy. >> >> Suppose we have 2 racing tasks calling __fh_to_dentry for the same >> inode. The first one races in and doesn't find anything. d_obtain alias >> creates a disconnected dentry and returns it. The next task then finds >> it, sees that it's disconnected and gets back -ESTALE. >> >> I think you may need to detect this situation in a different way. > > Yeah, you're right. Locally I have another version of patch, which > will add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING". > > If the file have hard links and there are more than one alias and one > of them is being unlinked, shouldn't we make sure we will pick a > normal one here ? If so we should iterate all the alias and filter out > the being unlinked ones. > > At the same time I found another issue for the "ceph_fh_to_dentry()". > That is we never check the inode->i_generation like other filesystems, > which will make sure the inode we are trying to open is the exactly > the same one saved in userspace. The inode maybe deleted and created > before this. > It seems never happening for ceph. The MDS will make sure it won't reuse the freed ino#. - Xiubo > Thanks > > Xiubo > >
On Tue, 2022-08-30 at 10:30 +0800, Xiubo Li wrote: > On 8/30/22 2:17 AM, Jeff Layton wrote: > > On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote: > > > From: Xiubo Li <xiubli@redhat.com> > > > > > > When unlinking a file the kclient will send a unlink request to MDS > > > by holding the dentry reference, and then the MDS will return 2 replies, > > > which are unsafe reply and a deferred safe reply. > > > > > > After the unsafe reply received the kernel will return and succeed > > > the unlink request to user space apps. > > > > > > Only when the safe reply received the dentry's reference will be > > > released. Or the dentry will only be unhashed from dcache. But when > > > the open_by_handle_at() begins to open the unlinked files it will > > > succeed. > > > > > > URL: https://tracker.ceph.com/issues/56524 > > > Signed-off-by: Xiubo Li <xiubli@redhat.com> > > > --- > > > > > > V2: > > > - If the dentry was released and inode is evicted such as by dropping > > > the caches, it will allocate a new dentry, which is also unhashed. > > > > > > > > > fs/ceph/export.c | 17 ++++++++++++++++- > > > 1 file changed, 16 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c > > > index 0ebf2bd93055..5edc1d31cd79 100644 > > > --- a/fs/ceph/export.c > > > +++ b/fs/ceph/export.c > > > @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) > > > static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > > > { > > > struct inode *inode = __lookup_inode(sb, ino); > > > + struct dentry *dentry; > > > int err; > > > > > > if (IS_ERR(inode)) > > > @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > > > iput(inode); > > > return ERR_PTR(-ESTALE); > > > } > > > - return d_obtain_alias(inode); > > > + > > > + /* > > > + * -ESTALE if the dentry exists and is unhashed, > > > + * which should be being released > > > + */ > > > + dentry = d_find_any_alias(inode); > > > + if (dentry && unlikely(d_unhashed(dentry))) { > > > + dput(dentry); > > > + return ERR_PTR(-ESTALE); > > > + } > > > + > > > + if (!dentry) > > > + dentry = d_obtain_alias(inode); > > > + > > > + return dentry; > > > } > > > > > > static struct dentry *__snapfh_to_dentry(struct super_block *sb, > > This looks racy. > > > > Suppose we have 2 racing tasks calling __fh_to_dentry for the same > > inode. The first one races in and doesn't find anything. d_obtain alias > > creates a disconnected dentry and returns it. The next task then finds > > it, sees that it's disconnected and gets back -ESTALE. > > > > I think you may need to detect this situation in a different way. > > Yeah, you're right. Locally I have another version of patch, which will > add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING". > > If the file have hard links and there are more than one alias and one of > them is being unlinked, shouldn't we make sure we will pick a normal one > here ? If so we should iterate all the alias and filter out the being > unlinked ones. > Possibly. Another option might be to try to catch this in the open codepath instead. Test whether the dentry you're trying to open has the IS_UNLINKING flag set, and return -ESTALE on open if so. If the problem goes beyond open though, then that may not be sufficient. > At the same time I found another issue for the "ceph_fh_to_dentry()". > That is we never check the inode->i_generation like other filesystems, > which will make sure the inode we are trying to open is the exactly the > same one saved in userspace. The inode maybe deleted and created before > this. > > Thanks > > Xiubo > >
diff --git a/fs/ceph/export.c b/fs/ceph/export.c index 0ebf2bd93055..5edc1d31cd79 100644 --- a/fs/ceph/export.c +++ b/fs/ceph/export.c @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) { struct inode *inode = __lookup_inode(sb, ino); + struct dentry *dentry; int err; if (IS_ERR(inode)) @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) iput(inode); return ERR_PTR(-ESTALE); } - return d_obtain_alias(inode); + + /* + * -ESTALE if the dentry exists and is unhashed, + * which should be being released + */ + dentry = d_find_any_alias(inode); + if (dentry && unlikely(d_unhashed(dentry))) { + dput(dentry); + return ERR_PTR(-ESTALE); + } + + if (!dentry) + dentry = d_obtain_alias(inode); + + return dentry; } static struct dentry *__snapfh_to_dentry(struct super_block *sb,