diff mbox series

[v2] ceph: fail the open_by_handle_at() if the dentry is being unlinked

Message ID 20220829045728.488148-1-xiubli@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v2] ceph: fail the open_by_handle_at() if the dentry is being unlinked | expand

Commit Message

Xiubo Li Aug. 29, 2022, 4:57 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

When unlinking a file the kclient will send a unlink request to MDS
by holding the dentry reference, and then the MDS will return 2 replies,
which are unsafe reply and a deferred safe reply.

After the unsafe reply received the kernel will return and succeed
the unlink request to user space apps.

Only when the safe reply received the dentry's reference will be
released. Or the dentry will only be unhashed from dcache. But when
the open_by_handle_at() begins to open the unlinked files it will
succeed.

URL: https://tracker.ceph.com/issues/56524
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---

V2:
- If the dentry was released and inode is evicted such as by dropping
  the caches, it will allocate a new dentry, which is also unhashed.


 fs/ceph/export.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

Jeff Layton Aug. 29, 2022, 6:17 p.m. UTC | #1
On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> When unlinking a file the kclient will send a unlink request to MDS
> by holding the dentry reference, and then the MDS will return 2 replies,
> which are unsafe reply and a deferred safe reply.
> 
> After the unsafe reply received the kernel will return and succeed
> the unlink request to user space apps.
> 
> Only when the safe reply received the dentry's reference will be
> released. Or the dentry will only be unhashed from dcache. But when
> the open_by_handle_at() begins to open the unlinked files it will
> succeed.
> 
> URL: https://tracker.ceph.com/issues/56524
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
> 
> V2:
> - If the dentry was released and inode is evicted such as by dropping
>   the caches, it will allocate a new dentry, which is also unhashed.
> 
> 
>  fs/ceph/export.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> index 0ebf2bd93055..5edc1d31cd79 100644
> --- a/fs/ceph/export.c
> +++ b/fs/ceph/export.c
> @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino)
>  static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
>  {
>  	struct inode *inode = __lookup_inode(sb, ino);
> +	struct dentry *dentry;
>  	int err;
>  
>  	if (IS_ERR(inode))
> @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
>  		iput(inode);
>  		return ERR_PTR(-ESTALE);
>  	}
> -	return d_obtain_alias(inode);
> +
> +	/*
> +	 * -ESTALE if the dentry exists and is unhashed,
> +	 * which should be being released
> +	 */
> +	dentry = d_find_any_alias(inode);
> +	if (dentry && unlikely(d_unhashed(dentry))) {
> +		dput(dentry);
> +		return ERR_PTR(-ESTALE);
> +	}
> +
> +	if (!dentry)
> +		dentry = d_obtain_alias(inode);
> +
> +	return dentry;
>  }
>  
>  static struct dentry *__snapfh_to_dentry(struct super_block *sb,

This looks racy.

Suppose we have 2 racing tasks calling __fh_to_dentry for the same
inode. The first one races in and doesn't find anything. d_obtain alias
creates a disconnected dentry and returns it. The next task then finds
it, sees that it's disconnected and gets back -ESTALE.

I think you may need to detect this situation in a different way.
Xiubo Li Aug. 30, 2022, 2:30 a.m. UTC | #2
On 8/30/22 2:17 AM, Jeff Layton wrote:
> On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> When unlinking a file the kclient will send a unlink request to MDS
>> by holding the dentry reference, and then the MDS will return 2 replies,
>> which are unsafe reply and a deferred safe reply.
>>
>> After the unsafe reply received the kernel will return and succeed
>> the unlink request to user space apps.
>>
>> Only when the safe reply received the dentry's reference will be
>> released. Or the dentry will only be unhashed from dcache. But when
>> the open_by_handle_at() begins to open the unlinked files it will
>> succeed.
>>
>> URL: https://tracker.ceph.com/issues/56524
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>
>> V2:
>> - If the dentry was released and inode is evicted such as by dropping
>>    the caches, it will allocate a new dentry, which is also unhashed.
>>
>>
>>   fs/ceph/export.c | 17 ++++++++++++++++-
>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
>> index 0ebf2bd93055..5edc1d31cd79 100644
>> --- a/fs/ceph/export.c
>> +++ b/fs/ceph/export.c
>> @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino)
>>   static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
>>   {
>>   	struct inode *inode = __lookup_inode(sb, ino);
>> +	struct dentry *dentry;
>>   	int err;
>>   
>>   	if (IS_ERR(inode))
>> @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
>>   		iput(inode);
>>   		return ERR_PTR(-ESTALE);
>>   	}
>> -	return d_obtain_alias(inode);
>> +
>> +	/*
>> +	 * -ESTALE if the dentry exists and is unhashed,
>> +	 * which should be being released
>> +	 */
>> +	dentry = d_find_any_alias(inode);
>> +	if (dentry && unlikely(d_unhashed(dentry))) {
>> +		dput(dentry);
>> +		return ERR_PTR(-ESTALE);
>> +	}
>> +
>> +	if (!dentry)
>> +		dentry = d_obtain_alias(inode);
>> +
>> +	return dentry;
>>   }
>>   
>>   static struct dentry *__snapfh_to_dentry(struct super_block *sb,
> This looks racy.
>
> Suppose we have 2 racing tasks calling __fh_to_dentry for the same
> inode. The first one races in and doesn't find anything. d_obtain alias
> creates a disconnected dentry and returns it. The next task then finds
> it, sees that it's disconnected and gets back -ESTALE.
>
> I think you may need to detect this situation in a different way.

Yeah, you're right. Locally I have another version of patch, which will 
add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING".

If the file have hard links and there are more than one alias and one of 
them is being unlinked, shouldn't we make sure we will pick a normal one 
here ? If so we should iterate all the alias and filter out the being 
unlinked ones.

At the same time I found another issue for the "ceph_fh_to_dentry()". 
That is we never check the inode->i_generation like other filesystems, 
which will make sure the inode we are trying to open is the exactly the 
same one saved in userspace. The inode maybe deleted and created before 
this.

Thanks

Xiubo
Xiubo Li Aug. 30, 2022, 6:29 a.m. UTC | #3
On 8/30/22 10:30 AM, Xiubo Li wrote:
>
> On 8/30/22 2:17 AM, Jeff Layton wrote:
>> On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote:
>>> From: Xiubo Li <xiubli@redhat.com>
>>>
>>> When unlinking a file the kclient will send a unlink request to MDS
>>> by holding the dentry reference, and then the MDS will return 2 
>>> replies,
>>> which are unsafe reply and a deferred safe reply.
>>>
>>> After the unsafe reply received the kernel will return and succeed
>>> the unlink request to user space apps.
>>>
>>> Only when the safe reply received the dentry's reference will be
>>> released. Or the dentry will only be unhashed from dcache. But when
>>> the open_by_handle_at() begins to open the unlinked files it will
>>> succeed.
>>>
>>> URL: https://tracker.ceph.com/issues/56524
>>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>>> ---
>>>
>>> V2:
>>> - If the dentry was released and inode is evicted such as by dropping
>>>    the caches, it will allocate a new dentry, which is also unhashed.
>>>
>>>
>>>   fs/ceph/export.c | 17 ++++++++++++++++-
>>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
>>> index 0ebf2bd93055..5edc1d31cd79 100644
>>> --- a/fs/ceph/export.c
>>> +++ b/fs/ceph/export.c
>>> @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct 
>>> super_block *sb, u64 ino)
>>>   static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
>>>   {
>>>       struct inode *inode = __lookup_inode(sb, ino);
>>> +    struct dentry *dentry;
>>>       int err;
>>>         if (IS_ERR(inode))
>>> @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct 
>>> super_block *sb, u64 ino)
>>>           iput(inode);
>>>           return ERR_PTR(-ESTALE);
>>>       }
>>> -    return d_obtain_alias(inode);
>>> +
>>> +    /*
>>> +     * -ESTALE if the dentry exists and is unhashed,
>>> +     * which should be being released
>>> +     */
>>> +    dentry = d_find_any_alias(inode);
>>> +    if (dentry && unlikely(d_unhashed(dentry))) {
>>> +        dput(dentry);
>>> +        return ERR_PTR(-ESTALE);
>>> +    }
>>> +
>>> +    if (!dentry)
>>> +        dentry = d_obtain_alias(inode);
>>> +
>>> +    return dentry;
>>>   }
>>>     static struct dentry *__snapfh_to_dentry(struct super_block *sb,
>> This looks racy.
>>
>> Suppose we have 2 racing tasks calling __fh_to_dentry for the same
>> inode. The first one races in and doesn't find anything. d_obtain alias
>> creates a disconnected dentry and returns it. The next task then finds
>> it, sees that it's disconnected and gets back -ESTALE.
>>
>> I think you may need to detect this situation in a different way.
>
> Yeah, you're right. Locally I have another version of patch, which 
> will add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING".
>
> If the file have hard links and there are more than one alias and one 
> of them is being unlinked, shouldn't we make sure we will pick a 
> normal one here ? If so we should iterate all the alias and filter out 
> the being unlinked ones.
>
> At the same time I found another issue for the "ceph_fh_to_dentry()". 
> That is we never check the inode->i_generation like other filesystems, 
> which will make sure the inode we are trying to open is the exactly 
> the same one saved in userspace. The inode maybe deleted and created 
> before this.
>
It seems never happening for ceph. The MDS will make sure it won't reuse 
the freed ino#.

- Xiubo

> Thanks
>
> Xiubo
>
>
Jeff Layton Aug. 30, 2022, 10:22 a.m. UTC | #4
On Tue, 2022-08-30 at 10:30 +0800, Xiubo Li wrote:
> On 8/30/22 2:17 AM, Jeff Layton wrote:
> > On Mon, 2022-08-29 at 12:57 +0800, xiubli@redhat.com wrote:
> > > From: Xiubo Li <xiubli@redhat.com>
> > > 
> > > When unlinking a file the kclient will send a unlink request to MDS
> > > by holding the dentry reference, and then the MDS will return 2 replies,
> > > which are unsafe reply and a deferred safe reply.
> > > 
> > > After the unsafe reply received the kernel will return and succeed
> > > the unlink request to user space apps.
> > > 
> > > Only when the safe reply received the dentry's reference will be
> > > released. Or the dentry will only be unhashed from dcache. But when
> > > the open_by_handle_at() begins to open the unlinked files it will
> > > succeed.
> > > 
> > > URL: https://tracker.ceph.com/issues/56524
> > > Signed-off-by: Xiubo Li <xiubli@redhat.com>
> > > ---
> > > 
> > > V2:
> > > - If the dentry was released and inode is evicted such as by dropping
> > >    the caches, it will allocate a new dentry, which is also unhashed.
> > > 
> > > 
> > >   fs/ceph/export.c | 17 ++++++++++++++++-
> > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> > > index 0ebf2bd93055..5edc1d31cd79 100644
> > > --- a/fs/ceph/export.c
> > > +++ b/fs/ceph/export.c
> > > @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino)
> > >   static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
> > >   {
> > >   	struct inode *inode = __lookup_inode(sb, ino);
> > > +	struct dentry *dentry;
> > >   	int err;
> > >   
> > >   	if (IS_ERR(inode))
> > > @@ -197,7 +198,21 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
> > >   		iput(inode);
> > >   		return ERR_PTR(-ESTALE);
> > >   	}
> > > -	return d_obtain_alias(inode);
> > > +
> > > +	/*
> > > +	 * -ESTALE if the dentry exists and is unhashed,
> > > +	 * which should be being released
> > > +	 */
> > > +	dentry = d_find_any_alias(inode);
> > > +	if (dentry && unlikely(d_unhashed(dentry))) {
> > > +		dput(dentry);
> > > +		return ERR_PTR(-ESTALE);
> > > +	}
> > > +
> > > +	if (!dentry)
> > > +		dentry = d_obtain_alias(inode);
> > > +
> > > +	return dentry;
> > >   }
> > >   
> > >   static struct dentry *__snapfh_to_dentry(struct super_block *sb,
> > This looks racy.
> > 
> > Suppose we have 2 racing tasks calling __fh_to_dentry for the same
> > inode. The first one races in and doesn't find anything. d_obtain alias
> > creates a disconnected dentry and returns it. The next task then finds
> > it, sees that it's disconnected and gets back -ESTALE.
> > 
> > I think you may need to detect this situation in a different way.
> 
> Yeah, you're right. Locally I have another version of patch, which will 
> add one di->flags bit, which is "CEPH_DENTRY_IS_UNLINKING".
> 
> If the file have hard links and there are more than one alias and one of 
> them is being unlinked, shouldn't we make sure we will pick a normal one 
> here ? If so we should iterate all the alias and filter out the being 
> unlinked ones.
> 

Possibly. Another option might be to try to catch this in the open
codepath instead. Test whether the dentry you're trying to open has the
IS_UNLINKING flag set, and return -ESTALE on open if so.

If the problem goes beyond open though, then that may not be sufficient.

> At the same time I found another issue for the "ceph_fh_to_dentry()". 
> That is we never check the inode->i_generation like other filesystems, 
> which will make sure the inode we are trying to open is the exactly the 
> same one saved in userspace. The inode maybe deleted and created before 
> this.
> 
> Thanks
> 
> Xiubo
> 
>
diff mbox series

Patch

diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index 0ebf2bd93055..5edc1d31cd79 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -182,6 +182,7 @@  struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino)
 static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
 {
 	struct inode *inode = __lookup_inode(sb, ino);
+	struct dentry *dentry;
 	int err;
 
 	if (IS_ERR(inode))
@@ -197,7 +198,21 @@  static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
 		iput(inode);
 		return ERR_PTR(-ESTALE);
 	}
-	return d_obtain_alias(inode);
+
+	/*
+	 * -ESTALE if the dentry exists and is unhashed,
+	 * which should be being released
+	 */
+	dentry = d_find_any_alias(inode);
+	if (dentry && unlikely(d_unhashed(dentry))) {
+		dput(dentry);
+		return ERR_PTR(-ESTALE);
+	}
+
+	if (!dentry)
+		dentry = d_obtain_alias(inode);
+
+	return dentry;
 }
 
 static struct dentry *__snapfh_to_dentry(struct super_block *sb,