
[v2,04/17] ovl: decode connected upper dir file handles

Message ID 1515086449-26563-5-git-send-email-amir73il@gmail.com (mailing list archive)
State New, archived

Commit Message

Amir Goldstein Jan. 4, 2018, 5:20 p.m. UTC
Until this change, we decoded upper file handles by instantiating an
overlay dentry from the real upper dentry. This is sufficient to handle
pure upper files, but insufficient to handle merge/impure dirs.

To that end, if decoded real upper dir is connected and hashed, we
lookup an overlay dentry with the same path as the real upper dir.
If decoded real upper is non-dir, we instantiate a disconnected overlay
dentry as before this change.

Because ovl_fh_to_dentry() returns connected overlay dir dentries,
exportfs never needs to call get_parent() and get_name() to reconnect an
upper overlay dir. Because connectable non-dir file handles are not
supported, exportfs will not be able to use fh_to_parent() and get_name()
methods to reconnect a disconnected non-dir to its parent. Therefore, the
methods get_parent() and get_name() are implemented just to print out a
sanity warning and the method fh_to_parent() is implemented to warn the
user that using the 'subtree_check' exportfs option is not supported.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 171 insertions(+), 1 deletion(-)
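
The three-way decision described in the commit message can be sketched in
userspace C. This is only an illustrative model, not the kernel code: the
toy_* names and the boolean fields are hypothetical stand-ins for d_is_dir(),
DCACHE_DISCONNECTED and d_unhashed(), and the return values merely name which
path ovl_get_dentry() in the patch would take.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few dentry properties the patch tests. */
struct toy_dentry {
	bool is_dir;		/* models d_is_dir() */
	bool disconnected;	/* models DCACHE_DISCONNECTED */
	bool unhashed;		/* models d_unhashed() */
};

enum toy_decode {
	TOY_DISCONNECTED_ALIAS,	/* non-dir: same behavior as before the change */
	TOY_CONNECTED_LOOKUP,	/* connected, hashed dir: lookup by path */
	TOY_ERROR,		/* *err holds a negative errno */
};

/* Pick the decode strategy for a real upper dentry. */
static enum toy_decode toy_decode_upper(const struct toy_dentry *upper,
					int *err)
{
	assert(err != NULL);
	*err = 0;

	/* Non-upper dentries are not handled at this point in the series. */
	if (!upper) {
		*err = -EACCES;
		return TOY_ERROR;
	}

	if (!upper->is_dir)
		return TOY_DISCONNECTED_ALIAS;

	/* A disconnected or unhashed dir is treated as removed. */
	if (upper->disconnected || upper->unhashed) {
		*err = -ENOENT;
		return TOY_ERROR;
	}

	return TOY_CONNECTED_LOOKUP;
}
```

In this model only the TOY_CONNECTED_LOOKUP case needs the new forward lookup
helpers; the other two cases return immediately.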

Comments

Amir Goldstein Jan. 5, 2018, 12:33 p.m. UTC | #1
On Thu, Jan 4, 2018 at 7:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Until this change, we decoded upper file handles by instantiating an
> overlay dentry from the real upper dentry. This is sufficient to handle
> pure upper files, but insufficient to handle merge/impure dirs.
>
> To that end, if decoded real upper dir is connected and hashed, we
> lookup an overlay dentry with the same path as the real upper dir.
> If decoded real upper is non-dir, we instantiate a disconnected overlay
> dentry as before this change.
>
> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
> exportfs never need to call get_parent() and get_name() to reconnect an
> upper overlay dir. Because connectable non-dir file handles are not
> supported, exportfs will not be able to use fh_to_parent() and get_name()
> methods to reconnect a disconnected non-dir to its parent. Therefore, the
> methods get_parent() and get_name() are implemented just to print out a
> sanity warning and the method fh_to_parent() is implemented to warn the
> user that using the 'subtree_check' exportfs option is not supported.
>

Reviewers who get this far should have their eyebrows slightly raised
after reading this commit message and should be asking themselves:

"Why not return a disconnected overlay dentry like any other fs and implement
ovl_get_parent()/ovl_get_name() by looking at parent/name of upper dir?"

I have had this debate with myself for a while and experimented a bit with
both approaches and in the end, I liked the "return connected dentry" result
better. I did not want to write this entire story in commit message, because
in the end, there is nothing incorrect about the choice of either
implementation; there are only pros and cons to each choice.

At the moment, the only argument I can think of to counter the chosen approach
is that it adds ~100 lines of code in the ovl_lookup_real() and
ovl_lookup_real_one() helpers that could have been avoided by using the common
reconnect_path() code in fs/exportfs/expfs.c.

The arguments to counter the disconnected dir approach are:
- Obtaining a disconnected overlay dir dentry would require a delicate
  re-factoring of ovl_lookup() to get a dentry with overlay parent info.
  I personally preferred to avoid doing that re-factoring unless it was
  proven worthy.
- Going down the path of disconnected dir would mean that the (non-trivial)
  code path of d_splice_alias() could be traveled, and that would mean
  writing more tests and introducing race cases that are very hard to hit
  on purpose. Taking the path of connecting the overlay dentry by forward
  lookup is therefore the safe and boring way to avoid surprises.
- In the current implementation, there is an anomaly in the multi lower
  layer setup. In that case, indexed upper dir inodes are hashed by the
  lower inode, but their file handles are encoded from the upper inode.
  Obtaining a disconnected dir from this type of upper file handle would
  have been a special case that would add more code and more complexity.
  With the forward lookup connect approach, the anomaly does not require
  changing the code - connecting the dentry is just less efficient in case
  there is an ancestor in inode cache (we won't find it in cache because
  we will be looking with the wrong inode) and that can be fixed later if
  we find that use case important enough.
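
For readers who want to see the "forward lookup connect" idea in isolation,
here is a minimal userspace sketch of it. It is a toy model, not the patch:
struct toy_real, struct toy_ovl and the toy_* functions are hypothetical, there
is no refcounting or locking, and the child array replaces a real name lookup.
Each pass finds the topmost ancestor of the target that is not yet connected,
looks it up by name under the connected overlay dentry, and checks that the
result wraps the expected real dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical model of a real (upper) dentry: a name and a parent. */
struct toy_real {
	const char *name;
	struct toy_real *parent;	/* the fs root is its own parent */
};

/* Hypothetical model of an overlay dentry wrapping one real dentry. */
struct toy_ovl {
	struct toy_real *real;
	struct toy_ovl *children[TOY_MAX_CHILDREN];
	size_t nr_children;
};

/* Look up a child of @parent by the name of @real; verify it wraps @real. */
static int toy_lookup_one(struct toy_ovl *parent, struct toy_real *real,
			  struct toy_ovl **out)
{
	for (size_t i = 0; i < parent->nr_children; i++) {
		struct toy_ovl *child = parent->children[i];

		if (strcmp(child->real->name, real->name) != 0)
			continue;
		if (child->real != real)
			return -ESTALE;	/* same name, different real dentry */
		*out = child;
		return 0;
	}
	return -ENOENT;
}

/* Connect @real by repeated forward lookups starting at the overlay root. */
static int toy_lookup_real(struct toy_ovl *root, struct toy_real *real,
			   struct toy_ovl **out)
{
	struct toy_ovl *connected = root;

	assert(root && real && out);
	while (connected->real != real) {
		struct toy_real *next = real;
		int err;

		/* Find the topmost ancestor of @real not yet connected. */
		while (next->parent != connected->real) {
			if (next->parent == next)
				return -EXDEV;	/* hit the real fs root */
			next = next->parent;
		}

		err = toy_lookup_one(connected, next, &connected);
		if (err)
			return err;
	}

	*out = connected;
	return 0;
}
```

Note that every pass walks up from the target again, so this cold-cache model
costs on the order of depth squared parent steps. That is the inefficiency an
inode cache lookup of an already connected ancestor would address.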

There. Now you see why I did not want this story in commit message?

Amir.
J. Bruce Fields Jan. 5, 2018, 3:18 p.m. UTC | #2
On Fri, Jan 05, 2018 at 02:33:22PM +0200, Amir Goldstein wrote:
> There. Now you see why I did not want this story in commit message?

No.  I think it's interesting, and might be useful to have around if
someone needs to revisit this decision in the future.  So I'd rather
have it in the changelog or in code comments.  I've had to track down
old mailing list threads for this kind of information in the past and
found it sometimes time-consuming.

--b.
Amir Goldstein Jan. 5, 2018, 3:34 p.m. UTC | #3
On Fri, Jan 5, 2018 at 5:18 PM, J . Bruce Fields <bfields@fieldses.org> wrote:
> On Fri, Jan 05, 2018 at 02:33:22PM +0200, Amir Goldstein wrote:
>> There. Now you see why I did not want this story in commit message?
>
> No.  I think it's interesting, and might be useful to have around if
> someone needs to revisit this decision in the future.  So I'd rather
> have it in the changelog or in code comments.  I've had to track down
> old mailing list threads for this kind of information in the past and
> found it sometimes time-consuming.
>

Fair enough, but I'll wait to hear from Miklos first, because he may
have different arguments, or maybe he will call BS on some of my
arguments.

Thanks,
Amir.
Miklos Szeredi Jan. 15, 2018, 11:33 a.m. UTC | #4
On Thu, Jan 4, 2018 at 6:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Until this change, we decoded upper file handles by instantiating an
> overlay dentry from the real upper dentry. This is sufficient to handle
> pure upper files, but insufficient to handle merge/impure dirs.
>
> To that end, if decoded real upper dir is connected and hashed, we
> lookup an overlay dentry with the same path as the real upper dir.
> If decoded real upper is non-dir, we instantiate a disconnected overlay
> dentry as before this change.
>
> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
> exportfs never need to call get_parent() and get_name() to reconnect an
> upper overlay dir. Because connectable non-dir file handles are not
> supported, exportfs will not be able to use fh_to_parent() and get_name()
> methods to reconnect a disconnected non-dir to its parent. Therefore, the
> methods get_parent() and get_name() are implemented just to print out a
> sanity warning and the method fh_to_parent() is implemented to warn the
> user that using the 'subtree_check' exportfs option is not supported.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/export.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 171 insertions(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 5c72784a0b4d..48ae02f3acb8 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -130,6 +130,145 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>         return dentry;
>  }
>
> +/*
> + * Lookup a child overlay dentry whose real dentry is @real.
> + * If @is_upper is true then we lookup a child overlay dentry with the same
> + * name as the real dentry. Otherwise, we need to consult index for lookup.
> + */
> +static struct dentry *ovl_lookup_real_one(struct dentry *parent,
> +                                         struct dentry *real, bool is_upper)
> +{
> +       struct dentry *this;
> +       struct qstr *name = &real->d_name;
> +       int err;
> +
> +       /* TODO: use index when looking up by lower real dentry */
> +       if (!is_upper)
> +               return ERR_PTR(-EACCES);
> +
> +       /* Lookup overlay dentry by real name */
> +       this = lookup_one_len_unlocked(name->name, parent, name->len);
> +       err = PTR_ERR(this);
> +       if (IS_ERR(this)) {
> +               goto fail;
> +       } else if (!this || !this->d_inode) {
> +               dput(this);
> +               err = -ENOENT;
> +               goto fail;
> +       } else if (ovl_dentry_upper(this) != real) {
> +               dput(this);
> +               err = -ESTALE;
> +               goto fail;
> +       }
> +
> +       return this;
> +
> +fail:
> +       pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, is_upper=%d, parent=%pd2, err=%i)\n",
> +                           real, is_upper, parent, err);
> +       return ERR_PTR(err);
> +}
> +
> +/*
> + * Lookup an overlay dentry whose real dentry is @real.
> + * If @is_upper is true then we lookup an overlay dentry with the same path
> + * as the real dentry. Otherwise, we need to consult index for lookup.
> + */
> +static struct dentry *ovl_lookup_real(struct super_block *sb,
> +                                     struct dentry *real, bool is_upper)
> +{
> +       struct dentry *connected;
> +       int err = 0;
> +
> +       /* TODO: use index when looking up by lower real dentry */
> +       if (!is_upper)
> +               return ERR_PTR(-EACCES);
> +
> +       connected = dget(sb->s_root);
> +       while (!err) {
> +               struct dentry *next, *this;
> +               struct dentry *parent = NULL;
> +               struct dentry *real_connected = ovl_dentry_upper(connected);
> +
> +               if (real_connected == real)
> +                       break;
> +
> +               next = dget(real);
> +               /* find the topmost dentry not yet connected */
> +               for (;;) {
> +                       parent = dget_parent(next);
> +
> +                       if (real_connected == parent)
> +                               break;
> +
> +                       /*
> +                        * If real file has been moved out of the layer root
> +                        * directory, we will eventully hit the real fs root.
> +                        */
> +                       if (parent == next) {
> +                               err = -EXDEV;
> +                               break;
> +                       }

This seems to assume no cross directory renames of directories in the
ancestry of "real", but AFAICS nothing prevents that.

Also, why not use the inode cache to find already connected dirs?
That seems more efficient than always going up to the root and going
down from there.

So, a working algorithm would be going up to the first connected
parent or root, lock parent, lookup name and restart.  Not guaranteed
to finish, since not protected against always racing with renames.
Can we take s_vfs_rename_sem on ovl to prevent that?
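
The "go up to the first connected parent, look up one name, restart" loop
suggested above can be sketched as a single-threaded userspace model. This is
an assumption-laden illustration only: struct toy_dir and the toy_* functions
are hypothetical, the "connected" flag stands in for finding an overlay dentry
in the inode cache, and the parent locking and rename races that make the real
problem hard are reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of a real upper directory. @connected stands in for
 * "an overlay dentry for this directory was found in the inode cache".
 */
struct toy_dir {
	struct toy_dir *parent;	/* the fs root is its own parent */
	bool connected;
};

/*
 * Stand-in for: lock @parent, look up @child by name, verify that the
 * overlay dentry found wraps the expected upper dentry. A real version
 * could fail here (for example with -ESTALE); this model always succeeds.
 */
static int toy_connect_one(struct toy_dir *parent, struct toy_dir *child)
{
	assert(parent->connected && child->parent == parent);
	child->connected = true;
	return 0;
}

/* Returns the number of lookups performed, or a negative errno. */
static int toy_connect(struct toy_dir *real)
{
	int lookups = 0;

	while (!real->connected) {
		struct toy_dir *next = real;
		int err;

		/* Walk up until the parent of @next is already connected. */
		while (!next->parent->connected) {
			if (next->parent == next)
				return -EXDEV;	/* unconnected fs root */
			next = next->parent;
		}

		err = toy_connect_one(next->parent, next);
		if (err)
			return err;

		/* One more level is connected; restart from @real. */
		lookups++;
	}

	return lookups;
}
```

Without concurrent renames the loop performs one lookup per unconnected
ancestor and always terminates. The termination concern raised above only
appears once a rename can move an ancestor between two passes.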

> +
> +                       dput(next);
> +                       next = parent;
> +               }
> +
> +               if (!err) {
> +                       this = ovl_lookup_real_one(connected, next, is_upper);
> +                       if (!IS_ERR(this)) {
> +                               dput(connected);
> +                               connected = this;
> +                       } else {
> +                               err = PTR_ERR(this);
> +                       }
> +               }
> +
> +               dput(parent);
> +               dput(next);
> +       }
> +
> +       if (err)
> +               goto fail;
> +
> +       return connected;
> +
> +fail:
> +       pr_warn_ratelimited("overlayfs: failed to lookup by real (%pd2, is_upper=%d, connected=%pd2, err=%i)\n",
> +                           real, is_upper, connected, err);
> +       dput(connected);
> +       return ERR_PTR(err);
> +}
> +
> +/*
> + * Get an overlay dentry from upper/lower real dentries.
> + */
> +static struct dentry *ovl_get_dentry(struct super_block *sb,
> +                                    struct dentry *upper,
> +                                    struct ovl_path *lowerpath)
> +{
> +       /* TODO: get non-upper dentry */
> +       if (!upper)
> +               return ERR_PTR(-EACCES);
> +
> +       /*
> +        * Obtain a disconnected overlay dentry from a non-dir real upper
> +        * dentry.
> +        */
> +       if (!d_is_dir(upper))
> +               return ovl_obtain_alias(sb, upper, NULL);
> +
> +       /* Removed empty directory? */
> +       if ((upper->d_flags & DCACHE_DISCONNECTED) || d_unhashed(upper))
> +               return ERR_PTR(-ENOENT);
> +
> +       /*
> +        * If real upper dentry is connected and hashed, get a connected
> +        * overlay dentry with the same path as the real upper dentry.
> +        */
> +       return ovl_lookup_real(sb, upper, true);
> +}
> +
>  static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
>                                         struct ovl_fh *fh)
>  {
> @@ -144,7 +283,7 @@ static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
>         if (IS_ERR_OR_NULL(upper))
>                 return upper;
>
> -       dentry = ovl_obtain_alias(sb, upper, NULL);
> +       dentry = ovl_get_dentry(sb, upper, NULL);
>         dput(upper);
>
>         return dentry;
> @@ -183,7 +322,38 @@ static struct dentry *ovl_fh_to_dentry(struct super_block *sb, struct fid *fid,
>         return ERR_PTR(err);
>  }
>
> +static struct dentry *ovl_fh_to_parent(struct super_block *sb, struct fid *fid,
> +                                      int fh_len, int fh_type)
> +{
> +       pr_warn_ratelimited("overlayfs: connectable file handles not supported; use 'no_subtree_check' exportfs option.\n");
> +       return ERR_PTR(-EACCES);
> +}
> +
> +static int ovl_get_name(struct dentry *parent, char *name,
> +                       struct dentry *child)
> +{
> +       /*
> +        * ovl_fh_to_dentry() returns connected dir overlay dentries and
> +        * ovl_fh_to_parent() is not implemented, so we should not get here.
> +        */
> +       WARN_ON_ONCE(1);
> +       return -EIO;
> +}
> +
> +static struct dentry *ovl_get_parent(struct dentry *dentry)
> +{
> +       /*
> +        * ovl_fh_to_dentry() returns connected dir overlay dentries, so we
> +        * should not get here.
> +        */
> +       WARN_ON_ONCE(1);
> +       return ERR_PTR(-EIO);
> +}
> +
>  const struct export_operations ovl_export_operations = {
>         .encode_fh      = ovl_encode_inode_fh,
>         .fh_to_dentry   = ovl_fh_to_dentry,
> +       .fh_to_parent   = ovl_fh_to_parent,
> +       .get_name       = ovl_get_name,
> +       .get_parent     = ovl_get_parent,
>  };
> --
> 2.7.4
>
Miklos Szeredi Jan. 15, 2018, 11:41 a.m. UTC | #5
On Fri, Jan 5, 2018 at 1:33 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Jan 4, 2018 at 7:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Until this change, we decoded upper file handles by instantiating an
>> overlay dentry from the real upper dentry. This is sufficient to handle
>> pure upper files, but insufficient to handle merge/impure dirs.
>>
>> To that end, if decoded real upper dir is connected and hashed, we
>> lookup an overlay dentry with the same path as the real upper dir.
>> If decoded real upper is non-dir, we instantiate a disconnected overlay
>> dentry as before this change.
>>
>> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
>> exportfs never need to call get_parent() and get_name() to reconnect an
>> upper overlay dir. Because connectable non-dir file handles are not
>> supported, exportfs will not be able to use fh_to_parent() and get_name()
>> methods to reconnect a disconnected non-dir to its parent. Therefore, the
>> methods get_parent() and get_name() are implemented just to print out a
>> sanity warning and the method fh_to_parent() is implemented to warn the
>> user that using the 'subtree_check' exportfs option is not supported.
>>
>
> Reviewers who will get this far, should have their eyebrows slightly raised
> after reading this commit message and should be asking themselves:
>
> "Why not return a disconnected overlay dentry like any other fs and implement
> ovl_get_parent()/ovl_get_name() by looking at parent/name of upper dir?"
>
> I have had this debate with myself for a while and experimented a bit with
> both approaches and in the end, I liked the "return connected dentry" result
> better. I did not want to write this entire story in commit message, because
> in the end, there is nothing incorrect about the choice of either implementation
> there are only pros and cons to each choice.
>
> At the moment, the only argument I can think of to counter the chosen approach
> is that it adds ~100 lines on code in ovl_lookup_real() and
> ovl_lookup_real_one()
> helpers that could have been avoided by using the common reconnect_path()
> code in fs/exportfs/expfs.c.

And also not having to deal with rename races would be good.  And the
way to do it is the same way as ovl_get_redirect(), except now we are
walking the upper layer instead of the overlay layer.

Not sure which approach is better.

Thanks,
Miklos
Amir Goldstein Jan. 15, 2018, 12:20 p.m. UTC | #6
On Mon, Jan 15, 2018 at 1:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Jan 4, 2018 at 6:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Until this change, we decoded upper file handles by instantiating an
>> overlay dentry from the real upper dentry. This is sufficient to handle
>> pure upper files, but insufficient to handle merge/impure dirs.
>>
>> To that end, if decoded real upper dir is connected and hashed, we
>> lookup an overlay dentry with the same path as the real upper dir.
>> If decoded real upper is non-dir, we instantiate a disconnected overlay
>> dentry as before this change.
>>
>> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
>> exportfs never need to call get_parent() and get_name() to reconnect an
>> upper overlay dir. Because connectable non-dir file handles are not
>> supported, exportfs will not be able to use fh_to_parent() and get_name()
>> methods to reconnect a disconnected non-dir to its parent. Therefore, the
>> methods get_parent() and get_name() are implemented just to print out a
>> sanity warning and the method fh_to_parent() is implemented to warn the
>> user that using the 'subtree_check' exportfs option is not supported.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/export.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 171 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
>> index 5c72784a0b4d..48ae02f3acb8 100644
>> --- a/fs/overlayfs/export.c
>> +++ b/fs/overlayfs/export.c
>> @@ -130,6 +130,145 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>>         return dentry;
>>  }
>>
>> +/*
>> + * Lookup a child overlay dentry whose real dentry is @real.
>> + * If @is_upper is true then we lookup a child overlay dentry with the same
>> + * name as the real dentry. Otherwise, we need to consult index for lookup.
>> + */
>> +static struct dentry *ovl_lookup_real_one(struct dentry *parent,
>> +                                         struct dentry *real, bool is_upper)
>> +{
>> +       struct dentry *this;
>> +       struct qstr *name = &real->d_name;
>> +       int err;
>> +
>> +       /* TODO: use index when looking up by lower real dentry */
>> +       if (!is_upper)
>> +               return ERR_PTR(-EACCES);
>> +
>> +       /* Lookup overlay dentry by real name */
>> +       this = lookup_one_len_unlocked(name->name, parent, name->len);
>> +       err = PTR_ERR(this);
>> +       if (IS_ERR(this)) {
>> +               goto fail;
>> +       } else if (!this || !this->d_inode) {
>> +               dput(this);
>> +               err = -ENOENT;
>> +               goto fail;
>> +       } else if (ovl_dentry_upper(this) != real) {
>> +               dput(this);
>> +               err = -ESTALE;
>> +               goto fail;
>> +       }
>> +
>> +       return this;
>> +
>> +fail:
>> +       pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, is_upper=%d, parent=%pd2, err=%i)\n",
>> +                           real, is_upper, parent, err);
>> +       return ERR_PTR(err);
>> +}
>> +
>> +/*
>> + * Lookup an overlay dentry whose real dentry is @real.
>> + * If @is_upper is true then we lookup an overlay dentry with the same path
>> + * as the real dentry. Otherwise, we need to consult index for lookup.
>> + */
>> +static struct dentry *ovl_lookup_real(struct super_block *sb,
>> +                                     struct dentry *real, bool is_upper)
>> +{
>> +       struct dentry *connected;
>> +       int err = 0;
>> +
>> +       /* TODO: use index when looking up by lower real dentry */
>> +       if (!is_upper)
>> +               return ERR_PTR(-EACCES);
>> +
>> +       connected = dget(sb->s_root);
>> +       while (!err) {
>> +               struct dentry *next, *this;
>> +               struct dentry *parent = NULL;
>> +               struct dentry *real_connected = ovl_dentry_upper(connected);
>> +
>> +               if (real_connected == real)
>> +                       break;
>> +
>> +               next = dget(real);
>> +               /* find the topmost dentry not yet connected */
>> +               for (;;) {
>> +                       parent = dget_parent(next);
>> +
>> +                       if (real_connected == parent)
>> +                               break;
>> +
>> +                       /*
>> +                        * If real file has been moved out of the layer root
>> +                        * directory, we will eventully hit the real fs root.
>> +                        */
>> +                       if (parent == next) {
>> +                               err = -EXDEV;
>> +                               break;
>> +                       }
>
> This seems to assume no cross directory renames of directories in the
> ancestry of "real", but AFAICS nothing prevents that.

Do you mean online modification of underlying fs? or rename in overlay?
For online modification of underlying fs, I don't see a reason to make it work.
-ESTALE would be a perfectly valid result in that case.

>
> Also why not use the inode cache to find already connected dirs?
> Seems more efficient, than always going up to the root and going down
> from there.

See patch [14/17] ovl: lookup connected ancestor of dir in inode cache
Sorry for ordering patches like this, it was more convenient to implement
the cold cache algorithm and then add hot cache into the mix.

>
> So, a working algorithm would be going up to the first connected
> parent or root, lock parent, lookup name and restart.  Not guaranteed
> to finish, since not protected against always racing with renames.
> Can we take s_vfs_rename_sem on ovl to prevent that?
>

Sounds like a simple and good enough solution.
Do we really need the locking of parent and restart connect if
we take s_vfs_rename_sem around ovl_lookup_real()?

Thanks,
Amir.
Miklos Szeredi Jan. 15, 2018, 2:56 p.m. UTC | #7
On Mon, Jan 15, 2018 at 1:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Jan 15, 2018 at 1:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Jan 4, 2018 at 6:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> Until this change, we decoded upper file handles by instantiating an
>>> overlay dentry from the real upper dentry. This is sufficient to handle
>>> pure upper files, but insufficient to handle merge/impure dirs.
>>>
>>> To that end, if decoded real upper dir is connected and hashed, we
>>> lookup an overlay dentry with the same path as the real upper dir.
>>> If decoded real upper is non-dir, we instantiate a disconnected overlay
>>> dentry as before this change.
>>>
>>> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
>>> exportfs never need to call get_parent() and get_name() to reconnect an
>>> upper overlay dir. Because connectable non-dir file handles are not
>>> supported, exportfs will not be able to use fh_to_parent() and get_name()
>>> methods to reconnect a disconnected non-dir to its parent. Therefore, the
>>> methods get_parent() and get_name() are implemented just to print out a
>>> sanity warning and the method fh_to_parent() is implemented to warn the
>>> user that using the 'subtree_check' exportfs option is not supported.
>>>
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>  fs/overlayfs/export.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 171 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
>>> index 5c72784a0b4d..48ae02f3acb8 100644
>>> --- a/fs/overlayfs/export.c
>>> +++ b/fs/overlayfs/export.c
>>> @@ -130,6 +130,145 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>>>         return dentry;
>>>  }
>>>
>>> +/*
>>> + * Lookup a child overlay dentry whose real dentry is @real.
>>> + * If @is_upper is true then we lookup a child overlay dentry with the same
>>> + * name as the real dentry. Otherwise, we need to consult index for lookup.
>>> + */
>>> +static struct dentry *ovl_lookup_real_one(struct dentry *parent,
>>> +                                         struct dentry *real, bool is_upper)
>>> +{
>>> +       struct dentry *this;
>>> +       struct qstr *name = &real->d_name;
>>> +       int err;
>>> +
>>> +       /* TODO: use index when looking up by lower real dentry */
>>> +       if (!is_upper)
>>> +               return ERR_PTR(-EACCES);
>>> +
>>> +       /* Lookup overlay dentry by real name */
>>> +       this = lookup_one_len_unlocked(name->name, parent, name->len);
>>> +       err = PTR_ERR(this);
>>> +       if (IS_ERR(this)) {
>>> +               goto fail;
>>> +       } else if (!this || !this->d_inode) {
>>> +               dput(this);
>>> +               err = -ENOENT;
>>> +               goto fail;
>>> +       } else if (ovl_dentry_upper(this) != real) {
>>> +               dput(this);
>>> +               err = -ESTALE;
>>> +               goto fail;
>>> +       }
>>> +
>>> +       return this;
>>> +
>>> +fail:
>>> +       pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, is_upper=%d, parent=%pd2, err=%i)\n",
>>> +                           real, is_upper, parent, err);
>>> +       return ERR_PTR(err);
>>> +}
>>> +
>>> +/*
>>> + * Lookup an overlay dentry whose real dentry is @real.
>>> + * If @is_upper is true then we lookup an overlay dentry with the same path
>>> + * as the real dentry. Otherwise, we need to consult index for lookup.
>>> + */
>>> +static struct dentry *ovl_lookup_real(struct super_block *sb,
>>> +                                     struct dentry *real, bool is_upper)
>>> +{
>>> +       struct dentry *connected;
>>> +       int err = 0;
>>> +
>>> +       /* TODO: use index when looking up by lower real dentry */
>>> +       if (!is_upper)
>>> +               return ERR_PTR(-EACCES);
>>> +
>>> +       connected = dget(sb->s_root);
>>> +       while (!err) {
>>> +               struct dentry *next, *this;
>>> +               struct dentry *parent = NULL;
>>> +               struct dentry *real_connected = ovl_dentry_upper(connected);
>>> +
>>> +               if (real_connected == real)
>>> +                       break;
>>> +
>>> +               next = dget(real);
>>> +               /* find the topmost dentry not yet connected */
>>> +               for (;;) {
>>> +                       parent = dget_parent(next);
>>> +
>>> +                       if (real_connected == parent)
>>> +                               break;
>>> +
>>> +                       /*
>>> +                        * If real file has been moved out of the layer root
>>> +                        * directory, we will eventully hit the real fs root.
>>> +                        */
>>> +                       if (parent == next) {
>>> +                               err = -EXDEV;
>>> +                               break;
>>> +                       }
>>
>> This seems to assume no cross directory renames of directories in the
>> ancestry of "real", but AFAICS nothing prevents that.
>
> Do you mean online modification of underlying fs? or rename in overlay?

Rename in overlay.

> For online modification of underlying fs, I don't see a reason to make it work.
> -ESTALE would be a perfectly valid result in that case.

Sure.

>>
>> Also why not use the inode cache to find already connected dirs?
>> Seems more efficient, than always going up to the root and going down
>> from there.
>
> See patch [14/17] ovl: lookup connected ancestor of dir in inode cache
> Sorry for ordering patches like this, it was more convenient to implement
> the cold cache algorithm and then add hot cache into the mix.

Okay.

>>
>> So, a working algorithm would be going up to the first connected
>> parent or root, lock parent, lookup name and restart.  Not guaranteed
>> to finish, since not protected against always racing with renames.
>> Can we take s_vfs_rename_sem on ovl to prevent that?
>>
>
> Sounds like a simple and good enough solution.
> Do we really need the locking of parent and restart connect if
> we take s_vfs_rename_sem around ovl_lookup_real()?

No, but s_vfs_rename_sem is a really heavyweight solution; we should
do better than that for decoding a file handle.

And we probably don't need anything else, since rename on ancestor
means renamed dir is connected, and hopefully not evicted from the
cache until we repeat the walk up.

So we need to lock the parent, look up the ovl dentry, verify we got the
same upper, and if not, retry the icache lookup.

Not sure we need to worry about that "hopefully".  Hopefully not.

Thanks,
Miklos

Patch

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 5c72784a0b4d..48ae02f3acb8 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -130,6 +130,145 @@  static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	return dentry;
 }
 
+/*
+ * Lookup a child overlay dentry whose real dentry is @real.
+ * If @is_upper is true then we lookup a child overlay dentry with the same
+ * name as the real dentry. Otherwise, we need to consult index for lookup.
+ */
+static struct dentry *ovl_lookup_real_one(struct dentry *parent,
+					  struct dentry *real, bool is_upper)
+{
+	struct dentry *this;
+	struct qstr *name = &real->d_name;
+	int err;
+
+	/* TODO: use index when looking up by lower real dentry */
+	if (!is_upper)
+		return ERR_PTR(-EACCES);
+
+	/* Lookup overlay dentry by real name */
+	this = lookup_one_len_unlocked(name->name, parent, name->len);
+	err = PTR_ERR(this);
+	if (IS_ERR(this)) {
+		goto fail;
+	} else if (!this || !this->d_inode) {
+		dput(this);
+		err = -ENOENT;
+		goto fail;
+	} else if (ovl_dentry_upper(this) != real) {
+		dput(this);
+		err = -ESTALE;
+		goto fail;
+	}
+
+	return this;
+
+fail:
+	pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, is_upper=%d, parent=%pd2, err=%i)\n",
+			    real, is_upper, parent, err);
+	return ERR_PTR(err);
+}
+
+/*
+ * Lookup an overlay dentry whose real dentry is @real.
+ * If @is_upper is true then we lookup an overlay dentry with the same path
+ * as the real dentry. Otherwise, we need to consult index for lookup.
+ */
+static struct dentry *ovl_lookup_real(struct super_block *sb,
+				      struct dentry *real, bool is_upper)
+{
+	struct dentry *connected;
+	int err = 0;
+
+	/* TODO: use index when looking up by lower real dentry */
+	if (!is_upper)
+		return ERR_PTR(-EACCES);
+
+	connected = dget(sb->s_root);
+	while (!err) {
+		struct dentry *next, *this;
+		struct dentry *parent = NULL;
+		struct dentry *real_connected = ovl_dentry_upper(connected);
+
+		if (real_connected == real)
+			break;
+
+		next = dget(real);
+		/* find the topmost dentry not yet connected */
+		for (;;) {
+			parent = dget_parent(next);
+
+			if (real_connected == parent)
+				break;
+
+			/*
+			 * If real file has been moved out of the layer root
+			 * directory, we will eventually hit the real fs root.
+			 */
+			if (parent == next) {
+				err = -EXDEV;
+				break;
+			}
+
+			dput(next);
+			next = parent;
+		}
+
+		if (!err) {
+			this = ovl_lookup_real_one(connected, next, is_upper);
+			if (!IS_ERR(this)) {
+				dput(connected);
+				connected = this;
+			} else {
+				err = PTR_ERR(this);
+			}
+		}
+
+		dput(parent);
+		dput(next);
+	}
+
+	if (err)
+		goto fail;
+
+	return connected;
+
+fail:
+	pr_warn_ratelimited("overlayfs: failed to lookup by real (%pd2, is_upper=%d, connected=%pd2, err=%i)\n",
+			    real, is_upper, connected, err);
+	dput(connected);
+	return ERR_PTR(err);
+}
+
+/*
+ * Get an overlay dentry from upper/lower real dentries.
+ */
+static struct dentry *ovl_get_dentry(struct super_block *sb,
+				     struct dentry *upper,
+				     struct ovl_path *lowerpath)
+{
+	/* TODO: get non-upper dentry */
+	if (!upper)
+		return ERR_PTR(-EACCES);
+
+	/*
+	 * Obtain a disconnected overlay dentry from a non-dir real upper
+	 * dentry.
+	 */
+	if (!d_is_dir(upper))
+		return ovl_obtain_alias(sb, upper, NULL);
+
+	/* Removed empty directory? */
+	if ((upper->d_flags & DCACHE_DISCONNECTED) || d_unhashed(upper))
+		return ERR_PTR(-ENOENT);
+
+	/*
+	 * If real upper dentry is connected and hashed, get a connected
+	 * overlay dentry with the same path as the real upper dentry.
+	 */
+	return ovl_lookup_real(sb, upper, true);
+}
+
 static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
 					struct ovl_fh *fh)
 {
@@ -144,7 +283,7 @@  static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
 	if (IS_ERR_OR_NULL(upper))
 		return upper;
 
-	dentry = ovl_obtain_alias(sb, upper, NULL);
+	dentry = ovl_get_dentry(sb, upper, NULL);
 	dput(upper);
 
 	return dentry;
@@ -183,7 +322,38 @@  static struct dentry *ovl_fh_to_dentry(struct super_block *sb, struct fid *fid,
 	return ERR_PTR(err);
 }
 
+static struct dentry *ovl_fh_to_parent(struct super_block *sb, struct fid *fid,
+				       int fh_len, int fh_type)
+{
+	pr_warn_ratelimited("overlayfs: connectable file handles not supported; use 'no_subtree_check' exportfs option.\n");
+	return ERR_PTR(-EACCES);
+}
+
+static int ovl_get_name(struct dentry *parent, char *name,
+			struct dentry *child)
+{
+	/*
+	 * ovl_fh_to_dentry() returns connected dir overlay dentries and
+	 * ovl_fh_to_parent() is not implemented, so we should not get here.
+	 */
+	WARN_ON_ONCE(1);
+	return -EIO;
+}
+
+static struct dentry *ovl_get_parent(struct dentry *dentry)
+{
+	/*
+	 * ovl_fh_to_dentry() returns connected dir overlay dentries, so we
+	 * should not get here.
+	 */
+	WARN_ON_ONCE(1);
+	return ERR_PTR(-EIO);
+}
+
 const struct export_operations ovl_export_operations = {
 	.encode_fh      = ovl_encode_inode_fh,
 	.fh_to_dentry	= ovl_fh_to_dentry,
+	.fh_to_parent	= ovl_fh_to_parent,
+	.get_name	= ovl_get_name,
+	.get_parent	= ovl_get_parent,
 };