diff mbox

[v2,07/11] ovl: set the COPYUP type flag for non-dirs

Message ID 1493025256-27188-8-git-send-email-amir73il@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Amir Goldstein April 24, 2017, 9:14 a.m. UTC
For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE.
Define a new type flag OVL_TYPE_COPYUP to indicate that an entry is
a target of a copy up.

For directory entries COPYUP = MERGE && UPPER. For non-dir entries
non zero oe->numlower implies COPYUP, but COPYUP does not imply
non zero oe->numlower.  COPYUP can also be set on lookup when detecting
an overlay.fh xattr on a non-dir, even if that fh cannot be followed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     |  3 +++
 fs/overlayfs/overlayfs.h |  2 ++
 fs/overlayfs/util.c      | 12 ++++++++----
 3 files changed, 13 insertions(+), 4 deletions(-)

Comments

Miklos Szeredi April 26, 2017, 2:40 p.m. UTC | #1
On Mon, Apr 24, 2017 at 11:14 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE.
> Define a new type flag OVL_TYPE_COPYUP to indicate that an entry is
> a target of a copy up.
>
> For directory entries COPYUP = MERGE && UPPER. For non-dir entries
> non zero oe->numlower implies COPYUP, but COPYUP does not imply
> non zero oe->numlower.  COPYUP can also be set on lookup when detecting
> an overlay.fh xattr on a non-dir, even if that fh cannot be followed.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/namei.c     |  3 +++
>  fs/overlayfs/overlayfs.h |  2 ++
>  fs/overlayfs/util.c      | 12 ++++++++----
>  3 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 318092a..73a8879 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -386,6 +386,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 }
>                 if (d.opaque)
>                         type |= __OVL_PATH_OPAQUE;
> +               /* overlay.fh xattr implies this is a copy up */
> +               if (d.fh)
> +                       type |= __OVL_PATH_COPYUP;
>         }
>
>         /*
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 08002ce..d0bb538 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -13,11 +13,13 @@ enum ovl_path_type {
>         __OVL_PATH_UPPER        = (1 << 0),
>         __OVL_PATH_MERGE        = (1 << 1),
>         __OVL_PATH_OPAQUE       = (1 << 2),
> +       __OVL_PATH_COPYUP       = (1 << 3),
>  };
>
>  #define OVL_TYPE_UPPER(type)   ((type) & __OVL_PATH_UPPER)
>  #define OVL_TYPE_MERGE(type)   ((type) & __OVL_PATH_MERGE)
>  #define OVL_TYPE_OPAQUE(type)  ((type) & __OVL_PATH_OPAQUE)
> +#define OVL_TYPE_COPYUP(type)  ((type) & __OVL_PATH_COPYUP)
>
>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index dba9753..89789bc 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -101,11 +101,15 @@ enum ovl_path_type ovl_update_type(struct dentry *dentry, bool is_dir)
>         if (oe->__upperdentry) {
>                 type |= __OVL_PATH_UPPER;
>                 /*
> -                * Non-dir dentry can hold lower dentry from before
> -                * copy-up.
> +                * oe->numlower implies a copy up, but copy up does not imply
> +                * oe->numlower.  It can also be set on lookup when detecting
> +                * an overlay.fh xattr on a non-dir that cannot be followed.

The code looks fine, but I don't understand the comment.  Why would we
set COPYUP flag when the fh cannot be followed?

The reason I think the COPYUP vs. MERGE distinction is needed is the
ovl_check_empty_and_clear() thing.  It starts with a merged directory
with some whiteouts in it and exchanges it with an empty and opaque
directory.   Normally the empty directory will be deleted immediately,
but if something fails during the deletion, then it will remain there.
  The overlay is left in a consistent state, but the association with
the original inode should still remain, so it will have COPYUP but not
MERGE.

Now the current code is actually broken, because we leave the old,
replaced directory in __upperdentry as well as the rest of the lower
stack.  So should the deletion fail after the replacement things won't
work properly.

I think we can fix that by replacing __upperdentry.  Luckily we are
under inode lock, so protected against concurrent readdir or creation
inside the directory.  Then we have lifetime problems.  Until now a
positive __upperdentry was assumed to have a lifetime equal to that of
the overlay dentry.  We'd need an old_upperdentry to save it.  I think
that's it it, but maybe there are other issues.

Thanks,
Miklos
Miklos Szeredi April 26, 2017, 2:53 p.m. UTC | #2
On Wed, Apr 26, 2017 at 4:40 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

> The reason I think the COPYUP vs. MERGE distinction is needed is the
> ovl_check_empty_and_clear() thing.  It starts with a merged directory
> with some whiteouts in it and exchanges it with an empty and opaque
> directory.   Normally the empty directory will be deleted immediately,
> but if something fails during the deletion, then it will remain there.
>   The overlay is left in a consistent state, but the association with
> the original inode should still remain, so it will have COPYUP but not
> MERGE.

One more thought: we could introduce a separate "overlay.merge"
attribute that is the exact opposite of "overlay.opaque".
"overlay.merge" would imply "overlay.fh" but "overlay.fh" would not
imply "overlay.merge".

It would allow us to optionally get rid of "overlay.opaque" when back
compatibility is not needed.

It would also allow a new feature: on metadata only updates of regular
files we wouldn't need to copy up the data.

Thanks,
Miklos
Amir Goldstein April 26, 2017, 2:57 p.m. UTC | #3
On Wed, Apr 26, 2017 at 5:40 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Apr 24, 2017 at 11:14 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE.
>> Define a new type flag OVL_TYPE_COPYUP to indicate that an entry is
>> a target of a copy up.
>>
>> For directory entries COPYUP = MERGE && UPPER. For non-dir entries
>> non zero oe->numlower implies COPYUP, but COPYUP does not imply
>> non zero oe->numlower.  COPYUP can also be set on lookup when detecting
>> an overlay.fh xattr on a non-dir, even if that fh cannot be followed.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/namei.c     |  3 +++
>>  fs/overlayfs/overlayfs.h |  2 ++
>>  fs/overlayfs/util.c      | 12 ++++++++----
>>  3 files changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
>> index 318092a..73a8879 100644
>> --- a/fs/overlayfs/namei.c
>> +++ b/fs/overlayfs/namei.c
>> @@ -386,6 +386,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                 }
>>                 if (d.opaque)
>>                         type |= __OVL_PATH_OPAQUE;
>> +               /* overlay.fh xattr implies this is a copy up */
>> +               if (d.fh)
>> +                       type |= __OVL_PATH_COPYUP;
>>         }
>>
>>         /*
>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>> index 08002ce..d0bb538 100644
>> --- a/fs/overlayfs/overlayfs.h
>> +++ b/fs/overlayfs/overlayfs.h
>> @@ -13,11 +13,13 @@ enum ovl_path_type {
>>         __OVL_PATH_UPPER        = (1 << 0),
>>         __OVL_PATH_MERGE        = (1 << 1),
>>         __OVL_PATH_OPAQUE       = (1 << 2),
>> +       __OVL_PATH_COPYUP       = (1 << 3),
>>  };
>>
>>  #define OVL_TYPE_UPPER(type)   ((type) & __OVL_PATH_UPPER)
>>  #define OVL_TYPE_MERGE(type)   ((type) & __OVL_PATH_MERGE)
>>  #define OVL_TYPE_OPAQUE(type)  ((type) & __OVL_PATH_OPAQUE)
>> +#define OVL_TYPE_COPYUP(type)  ((type) & __OVL_PATH_COPYUP)
>>
>>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
>> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> index dba9753..89789bc 100644
>> --- a/fs/overlayfs/util.c
>> +++ b/fs/overlayfs/util.c
>> @@ -101,11 +101,15 @@ enum ovl_path_type ovl_update_type(struct dentry *dentry, bool is_dir)
>>         if (oe->__upperdentry) {
>>                 type |= __OVL_PATH_UPPER;
>>                 /*
>> -                * Non-dir dentry can hold lower dentry from before
>> -                * copy-up.
>> +                * oe->numlower implies a copy up, but copy up does not imply
>> +                * oe->numlower.  It can also be set on lookup when detecting
>> +                * an overlay.fh xattr on a non-dir that cannot be followed.
>
> The code looks fine, but I don't understand the comment.  Why would we
> set COPYUP flag when the fh cannot be followed?
>

See patch #8 ovl: redirect non-dir by path on rename

overlay.fh *is* the indication of a copy up and non-dir copy ups
should be redirected
on rename.

With this, copying the layers will not break the constant inode property.
upper files will continue to follow by path to origin and report the
new (post copy)
stable inode number.
Amir Goldstein April 26, 2017, 3:02 p.m. UTC | #4
On Wed, Apr 26, 2017 at 5:53 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Apr 26, 2017 at 4:40 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>> The reason I think the COPYUP vs. MERGE distinction is needed is the
>> ovl_check_empty_and_clear() thing.  It starts with a merged directory
>> with some whiteouts in it and exchanges it with an empty and opaque
>> directory.   Normally the empty directory will be deleted immediately,
>> but if something fails during the deletion, then it will remain there.
>>   The overlay is left in a consistent state, but the association with
>> the original inode should still remain, so it will have COPYUP but not
>> MERGE.
>
> One more thought: we could introduce a separate "overlay.merge"
> attribute that is the exact opposite of "overlay.opaque".
> "overlay.merge" would imply "overlay.fh" but "overlay.fh" would not
> imply "overlay.merge".
>
> It would allow us to optionally get rid of "overlay.opaque" when back
> compatibility is not needed.
>
> It would also allow a new feature: on metadata only updates of regular
> files we wouldn't need to copy up the data.
>

So you intend to set overlay.merge for non-dir?
How is it different from overlay.fh then?
With it's new name, overlay.origin.fh indicates that there is a copy
up origin below us. Either directly below us, or at overlay.redirect.
We can also try to follow to origin by fh, but that is only an optimization -
an important optimization IMO, because file rename are more common
than dir renames and lookup stable inode by fh in a deep directory
with many layers will be much more efficient by fh.

Are we understanding each other w.r.t. overlay.merge vs overlay.fh?
Amir Goldstein April 26, 2017, 6:51 p.m. UTC | #5
On Wed, Apr 26, 2017 at 6:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 5:53 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Apr 26, 2017 at 4:40 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>>> The reason I think the COPYUP vs. MERGE distinction is needed is the
>>> ovl_check_empty_and_clear() thing.  It starts with a merged directory
>>> with some whiteouts in it and exchanges it with an empty and opaque
>>> directory.   Normally the empty directory will be deleted immediately,
>>> but if something fails during the deletion, then it will remain there.
>>>   The overlay is left in a consistent state, but the association with
>>> the original inode should still remain, so it will have COPYUP but not
>>> MERGE.
>>
>> One more thought: we could introduce a separate "overlay.merge"
>> attribute that is the exact opposite of "overlay.opaque".
>> "overlay.merge" would imply "overlay.fh" but "overlay.fh" would not
>> imply "overlay.merge".
>>
>> It would allow us to optionally get rid of "overlay.opaque" when back
>> compatibility is not needed.
>>
>> It would also allow a new feature: on metadata only updates of regular
>> files we wouldn't need to copy up the data.
>>
>
> So you intend to set overlay.merge for non-dir?
> How is it different from overlay.fh then?
> With it's new name, overlay.origin.fh indicates that there is a copy
> up origin below us. Either directly below us, or at overlay.redirect.
> We can also try to follow to origin by fh, but that is only an optimization -

I miss-spoke - redirect_fh to origin is not only as optimization.
Although renames do not depend on redirect_fh, hardlinks do.

As I learned from improved unionmount-testsuite:

./run --ov=1 hard-link
...
 ./run --link /mnt/a/no_foo110 /mnt/a/foo110
mount -t overlay overlay /mnt
-olowerdir=/upper/0:/lower,upperdir=/upper/1,workdir=/upper/work
sh (8035): drop_caches: 3
/mnt/a/foo110: inode number wrong (got 68908, want 68898)

This error happens in non-samefs case when there is more than 1 lower layer
and redirect_fh is disabled.

It happens after link and mount cycle because the linked upper file, does not
know how to lookup the lower origin.

The error does not happen with samefs and with single lower fs, i.e.:
./run --ov=0 hard-link
./run --ov=1 --samefs hard-link

Because in those cases, all the upper hardlinks follow to origin by fh
and report the same inode number.

I think this calls for setting overlay.redirect also on the target of
ovl_link()??

Amir.
Miklos Szeredi April 27, 2017, 9:32 a.m. UTC | #6
On Wed, Apr 26, 2017 at 5:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 5:53 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Apr 26, 2017 at 4:40 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>>> The reason I think the COPYUP vs. MERGE distinction is needed is the
>>> ovl_check_empty_and_clear() thing.  It starts with a merged directory
>>> with some whiteouts in it and exchanges it with an empty and opaque
>>> directory.   Normally the empty directory will be deleted immediately,
>>> but if something fails during the deletion, then it will remain there.
>>>   The overlay is left in a consistent state, but the association with
>>> the original inode should still remain, so it will have COPYUP but not
>>> MERGE.
>>
>> One more thought: we could introduce a separate "overlay.merge"
>> attribute that is the exact opposite of "overlay.opaque".
>> "overlay.merge" would imply "overlay.fh" but "overlay.fh" would not
>> imply "overlay.merge".
>>
>> It would allow us to optionally get rid of "overlay.opaque" when back
>> compatibility is not needed.
>>
>> It would also allow a new feature: on metadata only updates of regular
>> files we wouldn't need to copy up the data.
>>
>
> So you intend to set overlay.merge for non-dir?

Nope, not by default.

> How is it different from overlay.fh then?

It would make sense for regular files for the non-samefs or non-clone
fs cases if only metadata (attr, xattr) are modified but data is not.
We'd create an empty file with the copied up metadata and
"overlay.merge" set indicating that the data I/O should still be
redirected to the origin, while metadata is kept in the copied up
file.  This can be upgraded to a fully copied-up file later.

Not something for this series, obviously.

Thanks,
Miklos
diff mbox

Patch

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 318092a..73a8879 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -386,6 +386,9 @@  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (d.opaque)
 			type |= __OVL_PATH_OPAQUE;
+		/* overlay.fh xattr implies this is a copy up */
+		if (d.fh)
+			type |= __OVL_PATH_COPYUP;
 	}
 
 	/*
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 08002ce..d0bb538 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -13,11 +13,13 @@  enum ovl_path_type {
 	__OVL_PATH_UPPER	= (1 << 0),
 	__OVL_PATH_MERGE	= (1 << 1),
 	__OVL_PATH_OPAQUE	= (1 << 2),
+	__OVL_PATH_COPYUP	= (1 << 3),
 };
 
 #define OVL_TYPE_UPPER(type)	((type) & __OVL_PATH_UPPER)
 #define OVL_TYPE_MERGE(type)	((type) & __OVL_PATH_MERGE)
 #define OVL_TYPE_OPAQUE(type)	((type) & __OVL_PATH_OPAQUE)
+#define OVL_TYPE_COPYUP(type)	((type) & __OVL_PATH_COPYUP)
 
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index dba9753..89789bc 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -101,11 +101,15 @@  enum ovl_path_type ovl_update_type(struct dentry *dentry, bool is_dir)
 	if (oe->__upperdentry) {
 		type |= __OVL_PATH_UPPER;
 		/*
-		 * Non-dir dentry can hold lower dentry from before
-		 * copy-up.
+		 * oe->numlower implies a copy up, but copy up does not imply
+		 * oe->numlower.  It can also be set on lookup when detecting
+		 * an overlay.fh xattr on a non-dir that cannot be followed.
 		 */
-		if (oe->numlower && is_dir)
-			type |= __OVL_PATH_MERGE;
+		if (oe->numlower) {
+			type |= __OVL_PATH_COPYUP;
+			if (is_dir)
+				type |= __OVL_PATH_MERGE;
+		}
 	} else {
 		if (oe->numlower > 1)
 			type |= __OVL_PATH_MERGE;