diff mbox

[2/2] generic/066: add _require_metadata_replay

Message ID 1425000227-69601-2-git-send-email-jaegeuk@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Jaegeuk Kim Feb. 27, 2015, 1:23 a.m. UTC
This patch adds _require_metadata_replay to detect whether or not filesystem
supports metadata replay.

This should be used when:

1. create file A
2. write file A
3. fsync file A

4. write file A

5. create file B
6. fsync file B
7. crash!

In this case, if filesystem supports metadata_replay, file A's data written
by #4 should be recovered.
Otherwise, file A is recovered to #3.

Cc: Filipe Manana <fdmanana@suse.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 common/rc         | 18 ++++++++++++++++++
 tests/generic/066 |  2 +-
 2 files changed, 19 insertions(+), 1 deletion(-)

Comments

Lukas Czerner Feb. 27, 2015, 11:34 a.m. UTC | #1
On Thu, 26 Feb 2015, Jaegeuk Kim wrote:

> Date: Thu, 26 Feb 2015 17:23:47 -0800
> From: Jaegeuk Kim <jaegeuk@kernel.org>
> To: Dave Chinner <david@fromorbit.com>
> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>     Eric Sandeen <sandeen@redhat.com>
> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
> 
> This patch adds _require_metadata_replay to detect whether or not filesystem
> supports metadata replay.
> 
> This should be used when:
> 
> 1. create file A
> 2. write file A
> 3. fsync file A
> 
> 4. write file A
> 
> 5. create file B
> 6. fsync file B
> 7. crash!
> 
> In this case, if filesystem supports metadata_replay, file A's data written
> by #4 should be recovered.

What ? No it should not! file A was never fsync after the last
write, so there is no guarantee that we'll have the new content.
"Metadata replay" suggests that we only replay metadata, not data
and that's what file systems usually do with they journals, cows,
and so on...

But this test is not about writing to a file, it's about changing
file xattr. And while extended attributes are metadata, they are
_not_ file system metadata and as such I think it can be very well
treated as data. Hence explicit fsync is needed to make sure it's
written to the disk.

Now let's look at the test itself:


	echo "hello world!" >> $SCRATCH_MNT/foobar
	$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
	$SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
	ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
	touch $SCRATCH_MNT/qwerty
	$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty

	_crash_and_mount

	# Now only the xattr with name user.attr3 should be set in our file.
	echo "xattr names and values after second fsync log replay:"
	$GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
	_filter_scratch

No it should not have only attr3 because $SCRATCH_MNT/foobar was
never fsynced after attr3 removal. If the internal design of the
file system accidentally synces unrelated xattrs on unrelated fsync,
than that's fine, but that's not what we need to test for at all
because it might not be guaranteed by all the file systems. So I
think this last part of the test is rather pointless. So I think
this patch is not needed as well.

Or am I missing something Filipe ?

Thanks!
-Lukas

> Otherwise, file A is recovered to #3.


> 
> Cc: Filipe Manana <fdmanana@suse.com>
> Cc: Eric Sandeen <sandeen@redhat.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  common/rc         | 18 ++++++++++++++++++
>  tests/generic/066 |  2 +-
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/common/rc b/common/rc
> index 1ed9df5..e6e8d1f 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
>  	esac
>  }
>  
> +# Does this filesystem support metadata replay?
> +# Filesystem is able to recover metadata which were not written by fsync
> +# exlicitly. But another fsync'ed metadata should be followed by them.
> +_require_metadata_replay()
> +{
> +	_require_metadata_journaling $1
> +
> +	case "$FSTYP" in
> +	f2fs)
> +		# f2fs supports metadata_journaling, but does not recover any
> +		# intermediate metadata which was not fsync'ed explicitly.
> +		_notrun "$FSTYP does not support metadata replay"
> +		;;
> +	*)
> +		;;
> +	esac
> +}
> +
>  # Does fiemap support?
>  _require_fiemap()
>  {
> diff --git a/tests/generic/066 b/tests/generic/066
> index cb36506..3fefda4 100755
> --- a/tests/generic/066
> +++ b/tests/generic/066
> @@ -61,7 +61,7 @@ _need_to_be_root
>  _require_scratch
>  _require_dm_flakey
>  _require_attrs
> -_require_metadata_journaling $SCRATCH_DEV
> +_require_metadata_replay $SCRATCH_DEV
>  
>  _crash_and_mount()
>  {
> 
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana Feb. 27, 2015, 11:43 a.m. UTC | #2
On Fri, Feb 27, 2015 at 11:34 AM, Lukáš Czerner <lczerner@redhat.com> wrote:
> On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
>
>> Date: Thu, 26 Feb 2015 17:23:47 -0800
>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>> To: Dave Chinner <david@fromorbit.com>
>> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
>>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>>     Eric Sandeen <sandeen@redhat.com>
>> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
>>
>> This patch adds _require_metadata_replay to detect whether or not filesystem
>> supports metadata replay.
>>
>> This should be used when:
>>
>> 1. create file A
>> 2. write file A
>> 3. fsync file A
>>
>> 4. write file A
>>
>> 5. create file B
>> 6. fsync file B
>> 7. crash!
>>
>> In this case, if filesystem supports metadata_replay, file A's data written
>> by #4 should be recovered.
>
> What ? No it should not! file A was never fsync after the last
> write, so there is no guarantee that we'll have the new content.
> "Metadata replay" suggests that we only replay metadata, not data
> and that's what file systems usually do with they journals, cows,
> and so on...
>
> But this test is not about writing to a file, it's about changing
> file xattr. And while extended attributes are metadata, they are
> _not_ file system metadata and as such I think it can be very well
> treated as data. Hence explicit fsync is needed to make sure it's
> written to the disk.
>
> Now let's look at the test itself:
>
>
>         echo "hello world!" >> $SCRATCH_MNT/foobar
>         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
>         $SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
>         ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
>         touch $SCRATCH_MNT/qwerty
>         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
>
>         _crash_and_mount
>
>         # Now only the xattr with name user.attr3 should be set in our file.
>         echo "xattr names and values after second fsync log replay:"
>         $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
>         _filter_scratch
>
> No it should not have only attr3 because $SCRATCH_MNT/foobar was
> never fsynced after attr3 removal. If the internal design of the
> file system accidentally synces unrelated xattrs on unrelated fsync,
> than that's fine, but that's not what we need to test for at all
> because it might not be guaranteed by all the file systems. So I
> think this last part of the test is rather pointless. So I think
> this patch is not needed as well.
>
> Or am I missing something Filipe ?

So that second part of the test, essentially comes from this ordered
metadata dependency explanation Dave gave me recently on another
thread:

http://www.spinics.net/lists/linux-btrfs/msg42086.html

Yes, I'm considering xattrs as metadata (even though they can be seen
as data as well). This behaviour I'm testing for applies to ext3/4 and
xfs for example (and apparently intentional, since the test passes on
these filesystems).

Thanks

>
> Thanks!
> -Lukas
>
>> Otherwise, file A is recovered to #3.
>
>
>>
>> Cc: Filipe Manana <fdmanana@suse.com>
>> Cc: Eric Sandeen <sandeen@redhat.com>
>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>> ---
>>  common/rc         | 18 ++++++++++++++++++
>>  tests/generic/066 |  2 +-
>>  2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/common/rc b/common/rc
>> index 1ed9df5..e6e8d1f 100644
>> --- a/common/rc
>> +++ b/common/rc
>> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
>>       esac
>>  }
>>
>> +# Does this filesystem support metadata replay?
>> +# Filesystem is able to recover metadata which were not written by fsync
>> +# exlicitly. But another fsync'ed metadata should be followed by them.
>> +_require_metadata_replay()
>> +{
>> +     _require_metadata_journaling $1
>> +
>> +     case "$FSTYP" in
>> +     f2fs)
>> +             # f2fs supports metadata_journaling, but does not recover any
>> +             # intermediate metadata which was not fsync'ed explicitly.
>> +             _notrun "$FSTYP does not support metadata replay"
>> +             ;;
>> +     *)
>> +             ;;
>> +     esac
>> +}
>> +
>>  # Does fiemap support?
>>  _require_fiemap()
>>  {
>> diff --git a/tests/generic/066 b/tests/generic/066
>> index cb36506..3fefda4 100755
>> --- a/tests/generic/066
>> +++ b/tests/generic/066
>> @@ -61,7 +61,7 @@ _need_to_be_root
>>  _require_scratch
>>  _require_dm_flakey
>>  _require_attrs
>> -_require_metadata_journaling $SCRATCH_DEV
>> +_require_metadata_replay $SCRATCH_DEV
>>
>>  _crash_and_mount()
>>  {
>>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website, sponsored
> by Intel and developed in partnership with Slashdot Media, is your hub for all
> things parallel software development, from weekly thought leadership blogs to
> news, videos, case studies, tutorials and more. Take a look and join the
> conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Lukas Czerner Feb. 27, 2015, 1:10 p.m. UTC | #3
On Fri, 27 Feb 2015, Filipe David Manana wrote:

> Date: Fri, 27 Feb 2015 11:43:36 +0000
> From: Filipe David Manana <fdmanana@gmail.com>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
> 
> On Fri, Feb 27, 2015 at 11:34 AM, Lukáš Czerner <lczerner@redhat.com> wrote:
> > On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
> >
> >> Date: Thu, 26 Feb 2015 17:23:47 -0800
> >> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >> To: Dave Chinner <david@fromorbit.com>
> >> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
> >>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
> >>     Eric Sandeen <sandeen@redhat.com>
> >> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
> >>
> >> This patch adds _require_metadata_replay to detect whether or not filesystem
> >> supports metadata replay.
> >>
> >> This should be used when:
> >>
> >> 1. create file A
> >> 2. write file A
> >> 3. fsync file A
> >>
> >> 4. write file A
> >>
> >> 5. create file B
> >> 6. fsync file B
> >> 7. crash!
> >>
> >> In this case, if filesystem supports metadata_replay, file A's data written
> >> by #4 should be recovered.
> >
> > What ? No it should not! file A was never fsync after the last
> > write, so there is no guarantee that we'll have the new content.
> > "Metadata replay" suggests that we only replay metadata, not data
> > and that's what file systems usually do with they journals, cows,
> > and so on...
> >
> > But this test is not about writing to a file, it's about changing
> > file xattr. And while extended attributes are metadata, they are
> > _not_ file system metadata and as such I think it can be very well
> > treated as data. Hence explicit fsync is needed to make sure it's
> > written to the disk.
> >
> > Now let's look at the test itself:
> >
> >
> >         echo "hello world!" >> $SCRATCH_MNT/foobar
> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
> >         $SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
> >         ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
> >         touch $SCRATCH_MNT/qwerty
> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
> >
> >         _crash_and_mount
> >
> >         # Now only the xattr with name user.attr3 should be set in our file.
> >         echo "xattr names and values after second fsync log replay:"
> >         $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
> >         _filter_scratch
> >
> > No it should not have only attr3 because $SCRATCH_MNT/foobar was
> > never fsynced after attr3 removal. If the internal design of the
> > file system accidentally synces unrelated xattrs on unrelated fsync,
> > than that's fine, but that's not what we need to test for at all
> > because it might not be guaranteed by all the file systems. So I
> > think this last part of the test is rather pointless. So I think
> > this patch is not needed as well.
> >
> > Or am I missing something Filipe ?
> 
> So that second part of the test, essentially comes from this ordered
> metadata dependency explanation Dave gave me recently on another
> thread:
> 
> http://www.spinics.net/lists/linux-btrfs/msg42086.html

It's interesting, but it really applies only to metadata updates
since really we normally only journal metadata. We do not
consider extended attributes to be metadata, do we ?

> 
> Yes, I'm considering xattrs as metadata (even though they can be seen
> as data as well). This behaviour I'm testing for applies to ext3/4 and
> xfs for example (and apparently intentional, since the test passes on
> these filesystems).

Ok, I am confused. Clearly ext4, nor xfs consider xattrs metadata
which can be tested simply by attaching xattr and crashing the file
system immediately afterwards - the new xattr will not be there -
that's expected for data, but unexpected for metadata.

Now the fact that it works might be just a coincidence. Btw in the
discussion Dave never mentioned xattr, he only talks about inode
size and extent list changes which makes sense since those are
metadata and it's expected to be "stabilised" as he very well
described. I just do not think this applies to this case.

Also I think that his wording that fsync on the file implies fsync
on the directory is unfortunate because it does not. However it
implies that the directory will actually be stabilised as well due
to journalling. But the results are the same.

Anyway, maybe Dave can sprinkle some of his wisdom over this...

Thanks!
-Lukas

> 
> Thanks
> 
> >
> > Thanks!
> > -Lukas
> >
> >> Otherwise, file A is recovered to #3.
> >
> >
> >>
> >> Cc: Filipe Manana <fdmanana@suse.com>
> >> Cc: Eric Sandeen <sandeen@redhat.com>
> >> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >> ---
> >>  common/rc         | 18 ++++++++++++++++++
> >>  tests/generic/066 |  2 +-
> >>  2 files changed, 19 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/common/rc b/common/rc
> >> index 1ed9df5..e6e8d1f 100644
> >> --- a/common/rc
> >> +++ b/common/rc
> >> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
> >>       esac
> >>  }
> >>
> >> +# Does this filesystem support metadata replay?
> >> +# Filesystem is able to recover metadata which were not written by fsync
> >> +# exlicitly. But another fsync'ed metadata should be followed by them.
> >> +_require_metadata_replay()
> >> +{
> >> +     _require_metadata_journaling $1
> >> +
> >> +     case "$FSTYP" in
> >> +     f2fs)
> >> +             # f2fs supports metadata_journaling, but does not recover any
> >> +             # intermediate metadata which was not fsync'ed explicitly.
> >> +             _notrun "$FSTYP does not support metadata replay"
> >> +             ;;
> >> +     *)
> >> +             ;;
> >> +     esac
> >> +}
> >> +
> >>  # Does fiemap support?
> >>  _require_fiemap()
> >>  {
> >> diff --git a/tests/generic/066 b/tests/generic/066
> >> index cb36506..3fefda4 100755
> >> --- a/tests/generic/066
> >> +++ b/tests/generic/066
> >> @@ -61,7 +61,7 @@ _need_to_be_root
> >>  _require_scratch
> >>  _require_dm_flakey
> >>  _require_attrs
> >> -_require_metadata_journaling $SCRATCH_DEV
> >> +_require_metadata_replay $SCRATCH_DEV
> >>
> >>  _crash_and_mount()
> >>  {
> >>
> >
> > ------------------------------------------------------------------------------
> > Dive into the World of Parallel Programming The Go Parallel Website, sponsored
> > by Intel and developed in partnership with Slashdot Media, is your hub for all
> > things parallel software development, from weekly thought leadership blogs to
> > news, videos, case studies, tutorials and more. Take a look and join the
> > conversation now. http://goparallel.sourceforge.net/
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 
> 
> 
>
Filipe Manana Feb. 27, 2015, 2:34 p.m. UTC | #4
On Fri, Feb 27, 2015 at 1:10 PM, Lukáš Czerner <lczerner@redhat.com> wrote:
> On Fri, 27 Feb 2015, Filipe David Manana wrote:
>
>> Date: Fri, 27 Feb 2015 11:43:36 +0000
>> From: Filipe David Manana <fdmanana@gmail.com>
>> To: Lukáš Czerner <lczerner@redhat.com>
>> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
>>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
>>
>> On Fri, Feb 27, 2015 at 11:34 AM, Lukáš Czerner <lczerner@redhat.com> wrote:
>> > On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
>> >
>> >> Date: Thu, 26 Feb 2015 17:23:47 -0800
>> >> From: Jaegeuk Kim <jaegeuk@kernel.org>
>> >> To: Dave Chinner <david@fromorbit.com>
>> >> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
>> >>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>> >>     Eric Sandeen <sandeen@redhat.com>
>> >> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
>> >>
>> >> This patch adds _require_metadata_replay to detect whether or not filesystem
>> >> supports metadata replay.
>> >>
>> >> This should be used when:
>> >>
>> >> 1. create file A
>> >> 2. write file A
>> >> 3. fsync file A
>> >>
>> >> 4. write file A
>> >>
>> >> 5. create file B
>> >> 6. fsync file B
>> >> 7. crash!
>> >>
>> >> In this case, if filesystem supports metadata_replay, file A's data written
>> >> by #4 should be recovered.
>> >
>> > What ? No it should not! file A was never fsync after the last
>> > write, so there is no guarantee that we'll have the new content.
>> > "Metadata replay" suggests that we only replay metadata, not data
>> > and that's what file systems usually do with they journals, cows,
>> > and so on...
>> >
>> > But this test is not about writing to a file, it's about changing
>> > file xattr. And while extended attributes are metadata, they are
>> > _not_ file system metadata and as such I think it can be very well
>> > treated as data. Hence explicit fsync is needed to make sure it's
>> > written to the disk.
>> >
>> > Now let's look at the test itself:
>> >
>> >
>> >         echo "hello world!" >> $SCRATCH_MNT/foobar
>> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
>> >         $SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
>> >         ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
>> >         touch $SCRATCH_MNT/qwerty
>> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
>> >
>> >         _crash_and_mount
>> >
>> >         # Now only the xattr with name user.attr3 should be set in our file.
>> >         echo "xattr names and values after second fsync log replay:"
>> >         $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
>> >         _filter_scratch
>> >
>> > No it should not have only attr3 because $SCRATCH_MNT/foobar was
>> > never fsynced after attr3 removal. If the internal design of the
>> > file system accidentally synces unrelated xattrs on unrelated fsync,
>> > than that's fine, but that's not what we need to test for at all
>> > because it might not be guaranteed by all the file systems. So I
>> > think this last part of the test is rather pointless. So I think
>> > this patch is not needed as well.
>> >
>> > Or am I missing something Filipe ?
>>
>> So that second part of the test, essentially comes from this ordered
>> metadata dependency explanation Dave gave me recently on another
>> thread:
>>
>> http://www.spinics.net/lists/linux-btrfs/msg42086.html
>
> It's interesting, but it really applies only to metadata updates
> since really we normally only journal metadata. We do not
> consider extended attributes to be metadata, do we ?
>
>>
>> Yes, I'm considering xattrs as metadata (even though they can be seen
>> as data as well). This behaviour I'm testing for applies to ext3/4 and
>> xfs for example (and apparently intentional, since the test passes on
>> these filesystems).
>
> Ok, I am confused. Clearly ext4, nor xfs consider xattrs metadata
> which can be tested simply by attaching xattr and crashing the file
> system immediately afterwards - the new xattr will not be there -
> that's expected for data, but unexpected for metadata.

Hum, my testing (on a 3.19 kernel) for both ext4 and xfs shows me
something different.
Perhaps we're testing differently.

So I just replaced the the line that adds a hard link to our inode with:

$SETFATTR_PROG -n user.attr4 -v val4 $SCRATCH_MNT/foobar

Then after the crash + mount, the new xattr is listed (with the
correct value val4).
I also tried the variants of leaving the 'ln' command and adding the
xattr either right before or right after the 'ln' command. For both
cases, the new xattr was there after crash + mount.

How did you test?

>
> Now the fact that it works might be just a coincidence. Btw in the
> discussion Dave never mentioned xattr, he only talks about inode
> size and extent list changes which makes sense since those are
> metadata and it's expected to be "stabilised" as he very well
> described. I just do not think this applies to this case.

Correct, I assumed his explanation applied to metadata in general and
I'm considering xattrs as metadata (though as I said before, it's not
unreasonable to consider them as data as well imho).

>
> Also I think that his wording that fsync on the file implies fsync
> on the directory is unfortunate because it does not. However it
> implies that the directory will actually be stabilised as well due
> to journalling. But the results are the same.
>
> Anyway, maybe Dave can sprinkle some of his wisdom over this...

Yes :)

Thanks

>
> Thanks!
> -Lukas
>
>>
>> Thanks
>>
>> >
>> > Thanks!
>> > -Lukas
>> >
>> >> Otherwise, file A is recovered to #3.
>> >
>> >
>> >>
>> >> Cc: Filipe Manana <fdmanana@suse.com>
>> >> Cc: Eric Sandeen <sandeen@redhat.com>
>> >> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>> >> ---
>> >>  common/rc         | 18 ++++++++++++++++++
>> >>  tests/generic/066 |  2 +-
>> >>  2 files changed, 19 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/common/rc b/common/rc
>> >> index 1ed9df5..e6e8d1f 100644
>> >> --- a/common/rc
>> >> +++ b/common/rc
>> >> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
>> >>       esac
>> >>  }
>> >>
>> >> +# Does this filesystem support metadata replay?
>> >> +# Filesystem is able to recover metadata which were not written by fsync
>> >> +# exlicitly. But another fsync'ed metadata should be followed by them.
>> >> +_require_metadata_replay()
>> >> +{
>> >> +     _require_metadata_journaling $1
>> >> +
>> >> +     case "$FSTYP" in
>> >> +     f2fs)
>> >> +             # f2fs supports metadata_journaling, but does not recover any
>> >> +             # intermediate metadata which was not fsync'ed explicitly.
>> >> +             _notrun "$FSTYP does not support metadata replay"
>> >> +             ;;
>> >> +     *)
>> >> +             ;;
>> >> +     esac
>> >> +}
>> >> +
>> >>  # Does fiemap support?
>> >>  _require_fiemap()
>> >>  {
>> >> diff --git a/tests/generic/066 b/tests/generic/066
>> >> index cb36506..3fefda4 100755
>> >> --- a/tests/generic/066
>> >> +++ b/tests/generic/066
>> >> @@ -61,7 +61,7 @@ _need_to_be_root
>> >>  _require_scratch
>> >>  _require_dm_flakey
>> >>  _require_attrs
>> >> -_require_metadata_journaling $SCRATCH_DEV
>> >> +_require_metadata_replay $SCRATCH_DEV
>> >>
>> >>  _crash_and_mount()
>> >>  {
>> >>
>> >
>> > ------------------------------------------------------------------------------
>> > Dive into the World of Parallel Programming The Go Parallel Website, sponsored
>> > by Intel and developed in partnership with Slashdot Media, is your hub for all
>> > things parallel software development, from weekly thought leadership blogs to
>> > news, videos, case studies, tutorials and more. Take a look and join the
>> > conversation now. http://goparallel.sourceforge.net/
>> > _______________________________________________
>> > Linux-f2fs-devel mailing list
>> > Linux-f2fs-devel@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>
>>
>>
>>
Lukas Czerner Feb. 27, 2015, 3:05 p.m. UTC | #5
On Fri, 27 Feb 2015, Filipe David Manana wrote:

> Date: Fri, 27 Feb 2015 14:34:35 +0000
> From: Filipe David Manana <fdmanana@gmail.com>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
> 
> On Fri, Feb 27, 2015 at 1:10 PM, Lukáš Czerner <lczerner@redhat.com> wrote:
> > On Fri, 27 Feb 2015, Filipe David Manana wrote:
> >
> >> Date: Fri, 27 Feb 2015 11:43:36 +0000
> >> From: Filipe David Manana <fdmanana@gmail.com>
> >> To: Lukáš Czerner <lczerner@redhat.com>
> >> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
> >>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
> >>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
> >>
> >> On Fri, Feb 27, 2015 at 11:34 AM, Lukáš Czerner <lczerner@redhat.com> wrote:
> >> > On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
> >> >
> >> >> Date: Thu, 26 Feb 2015 17:23:47 -0800
> >> >> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >> >> To: Dave Chinner <david@fromorbit.com>
> >> >> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
> >> >>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
> >> >>     Eric Sandeen <sandeen@redhat.com>
> >> >> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
> >> >>
> >> >> This patch adds _require_metadata_replay to detect whether or not filesystem
> >> >> supports metadata replay.
> >> >>
> >> >> This should be used when:
> >> >>
> >> >> 1. create file A
> >> >> 2. write file A
> >> >> 3. fsync file A
> >> >>
> >> >> 4. write file A
> >> >>
> >> >> 5. create file B
> >> >> 6. fsync file B
> >> >> 7. crash!
> >> >>
> >> >> In this case, if filesystem supports metadata_replay, file A's data written
> >> >> by #4 should be recovered.
> >> >
> >> > What ? No it should not! file A was never fsync after the last
> >> > write, so there is no guarantee that we'll have the new content.
> >> > "Metadata replay" suggests that we only replay metadata, not data
> >> > and that's what file systems usually do with they journals, cows,
> >> > and so on...
> >> >
> >> > But this test is not about writing to a file, it's about changing
> >> > file xattr. And while extended attributes are metadata, they are
> >> > _not_ file system metadata and as such I think it can be very well
> >> > treated as data. Hence explicit fsync is needed to make sure it's
> >> > written to the disk.
> >> >
> >> > Now let's look at the test itself:
> >> >
> >> >
> >> >         echo "hello world!" >> $SCRATCH_MNT/foobar
> >> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
> >> >         $SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
> >> >         ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
> >> >         touch $SCRATCH_MNT/qwerty
> >> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
> >> >
> >> >         _crash_and_mount
> >> >
> >> >         # Now only the xattr with name user.attr3 should be set in our file.
> >> >         echo "xattr names and values after second fsync log replay:"
> >> >         $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
> >> >         _filter_scratch
> >> >
> >> > No it should not have only attr3 because $SCRATCH_MNT/foobar was
> >> > never fsynced after attr3 removal. If the internal design of the
> >> > file system accidentally synces unrelated xattrs on unrelated fsync,
> >> > than that's fine, but that's not what we need to test for at all
> >> > because it might not be guaranteed by all the file systems. So I
> >> > think this last part of the test is rather pointless. So I think
> >> > this patch is not needed as well.
> >> >
> >> > Or am I missing something Filipe ?
> >>
> >> So that second part of the test, essentially comes from this ordered
> >> metadata dependency explanation Dave gave me recently on another
> >> thread:
> >>
> >> http://www.spinics.net/lists/linux-btrfs/msg42086.html
> >
> > It's interesting, but it really applies only to metadata updates
> > since really we normally only journal metadata. We do not
> > consider extended attributes to be metadata, do we ?
> >
> >>
> >> Yes, I'm considering xattrs as metadata (even though they can be seen
> >> as data as well). This behaviour I'm testing for applies to ext3/4 and
> >> xfs for example (and apparently intentional, since the test passes on
> >> these filesystems).
> >
> > Ok, I am confused. Clearly ext4, nor xfs consider xattrs metadata
> > which can be tested simply by attaching xattr and crashing the file
> > system immediately afterwards - the new xattr will not be there -
> > that's expected for data, but unexpected for metadata.
> 
> Hum, my testing (on a 3.19 kernel) for both ext4 and xfs shows me
> something different.
> Perhaps we're testing differently.
> 
> So I just replaced the the line that adds a hard link to our inode with:
> 
> $SETFATTR_PROG -n user.attr4 -v val4 $SCRATCH_MNT/foobar
> 
> Then after the crash + mount, the new xattr is listed (with the
> correct value val4).
> I also tried the variants of leaving the 'ln' command and adding the
> xattr either right before or right after the 'ln' command. For both
> cases, the new xattr was there after crash + mount.
> 
> How did you test?

Simply created file, fsync, added xattr and crash. But I think I am
confusing consistency vs. persistence again :(

> 
> >
> > Now the fact that it works might be just a coincidence. Btw in the
> > discussion Dave never mentioned xattr, he only talks about inode
> > size and extent list changes which makes sense since those are
> > metadata and it's expected to be "stabilised" as he very well
> > described. I just do not think this applies to this case.
> 
> Correct, I assumed his explanation applied to metadata in general and
> I'm considering xattrs as metadata (though as I said before, it's not
> unreasonable to consider them as data as well imho).

Well that's the thing I am not really sure about. Can not find any
documentation about this anywhere at all.

-Lukas

> 
> >
> > Also I think that his wording that fsync on the file implies fsync
> > on the directory is unfortunate because it does not. However it
> > implies that the directory will actually be stabilised as well due
> > to journalling. But the results are the same.
> >
> > Anyway, maybe Dave can sprinkle some of his wisdom over this...
> 
> Yes :)
> 
> Thanks
> 
> >
> > Thanks!
> > -Lukas
> >
> >>
> >> Thanks
> >>
> >> >
> >> > Thanks!
> >> > -Lukas
> >> >
> >> >> Otherwise, file A is recovered to #3.
> >> >
> >> >
> >> >>
> >> >> Cc: Filipe Manana <fdmanana@suse.com>
> >> >> Cc: Eric Sandeen <sandeen@redhat.com>
> >> >> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >> >> ---
> >> >>  common/rc         | 18 ++++++++++++++++++
> >> >>  tests/generic/066 |  2 +-
> >> >>  2 files changed, 19 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/common/rc b/common/rc
> >> >> index 1ed9df5..e6e8d1f 100644
> >> >> --- a/common/rc
> >> >> +++ b/common/rc
> >> >> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
> >> >>       esac
> >> >>  }
> >> >>
> >> >> +# Does this filesystem support metadata replay?
> >> >> +# Filesystem is able to recover metadata which were not written by fsync
> >> >> +# exlicitly. But another fsync'ed metadata should be followed by them.
> >> >> +_require_metadata_replay()
> >> >> +{
> >> >> +     _require_metadata_journaling $1
> >> >> +
> >> >> +     case "$FSTYP" in
> >> >> +     f2fs)
> >> >> +             # f2fs supports metadata_journaling, but does not recover any
> >> >> +             # intermediate metadata which was not fsync'ed explicitly.
> >> >> +             _notrun "$FSTYP does not support metadata replay"
> >> >> +             ;;
> >> >> +     *)
> >> >> +             ;;
> >> >> +     esac
> >> >> +}
> >> >> +
> >> >>  # Does fiemap support?
> >> >>  _require_fiemap()
> >> >>  {
> >> >> diff --git a/tests/generic/066 b/tests/generic/066
> >> >> index cb36506..3fefda4 100755
> >> >> --- a/tests/generic/066
> >> >> +++ b/tests/generic/066
> >> >> @@ -61,7 +61,7 @@ _need_to_be_root
> >> >>  _require_scratch
> >> >>  _require_dm_flakey
> >> >>  _require_attrs
> >> >> -_require_metadata_journaling $SCRATCH_DEV
> >> >> +_require_metadata_replay $SCRATCH_DEV
> >> >>
> >> >>  _crash_and_mount()
> >> >>  {
> >> >>
> >> >
> >> > ------------------------------------------------------------------------------
> >> > Dive into the World of Parallel Programming The Go Parallel Website, sponsored
> >> > by Intel and developed in partnership with Slashdot Media, is your hub for all
> >> > things parallel software development, from weekly thought leadership blogs to
> >> > news, videos, case studies, tutorials and more. Take a look and join the
> >> > conversation now. http://goparallel.sourceforge.net/
> >> > _______________________________________________
> >> > Linux-f2fs-devel mailing list
> >> > Linux-f2fs-devel@lists.sourceforge.net
> >> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>
> >>
> >>
> >>
> 
> 
> 
>
Filipe Manana Feb. 27, 2015, 3:09 p.m. UTC | #6
On Fri, Feb 27, 2015 at 3:05 PM, Lukáš Czerner <lczerner@redhat.com> wrote:
> On Fri, 27 Feb 2015, Filipe David Manana wrote:
>
>> Date: Fri, 27 Feb 2015 14:34:35 +0000
>> From: Filipe David Manana <fdmanana@gmail.com>
>> To: Lukáš Czerner <lczerner@redhat.com>
>> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
>>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
>>
>> On Fri, Feb 27, 2015 at 1:10 PM, Lukáš Czerner <lczerner@redhat.com> wrote:
>> > On Fri, 27 Feb 2015, Filipe David Manana wrote:
>> >
>> >> Date: Fri, 27 Feb 2015 11:43:36 +0000
>> >> From: Filipe David Manana <fdmanana@gmail.com>
>> >> To: Lukáš Czerner <lczerner@redhat.com>
>> >> Cc: Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>> >>     Eric Sandeen <sandeen@redhat.com>, Dave Chinner <david@fromorbit.com>,
>> >>     fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
>> >> Subject: Re: [f2fs-dev] [PATCH 2/2] generic/066: add _require_metadata_replay
>> >>
>> >> On Fri, Feb 27, 2015 at 11:34 AM, Lukáš Czerner <lczerner@redhat.com> wrote:
>> >> > On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
>> >> >
>> >> >> Date: Thu, 26 Feb 2015 17:23:47 -0800
>> >> >> From: Jaegeuk Kim <jaegeuk@kernel.org>
>> >> >> To: Dave Chinner <david@fromorbit.com>
>> >> >> Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
>> >> >>     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
>> >> >>     Eric Sandeen <sandeen@redhat.com>
>> >> >> Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
>> >> >>
>> >> >> This patch adds _require_metadata_replay to detect whether or not filesystem
>> >> >> supports metadata replay.
>> >> >>
>> >> >> This should be used when:
>> >> >>
>> >> >> 1. create file A
>> >> >> 2. write file A
>> >> >> 3. fsync file A
>> >> >>
>> >> >> 4. write file A
>> >> >>
>> >> >> 5. create file B
>> >> >> 6. fsync file B
>> >> >> 7. crash!
>> >> >>
>> >> >> In this case, if filesystem supports metadata_replay, file A's data written
>> >> >> by #4 should be recovered.
>> >> >
>> >> > What ? No it should not! file A was never fsync after the last
>> >> > write, so there is no guarantee that we'll have the new content.
>> >> > "Metadata replay" suggests that we only replay metadata, not data
>> >> > and that's what file systems usually do with they journals, cows,
>> >> > and so on...
>> >> >
>> >> > But this test is not about writing to a file, it's about changing
>> >> > file xattr. And while extended attributes are metadata, they are
>> >> > _not_ file system metadata and as such I think it can be very well
>> >> > treated as data. Hence explicit fsync is needed to make sure it's
>> >> > written to the disk.
>> >> >
>> >> > Now let's look at the test itself:
>> >> >
>> >> >
>> >> >         echo "hello world!" >> $SCRATCH_MNT/foobar
>> >> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
>> >> >         $SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
>> >> >         ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
>> >> >         touch $SCRATCH_MNT/qwerty
>> >> >         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
>> >> >
>> >> >         _crash_and_mount
>> >> >
>> >> >         # Now only the xattr with name user.attr3 should be set in our file.
>> >> >         echo "xattr names and values after second fsync log replay:"
>> >> >         $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
>> >> >         _filter_scratch
>> >> >
>> >> > No it should not have only attr3 because $SCRATCH_MNT/foobar was
>> >> > never fsynced after attr3 removal. If the internal design of the
>> >> > file system accidentally synces unrelated xattrs on unrelated fsync,
>> >> > than that's fine, but that's not what we need to test for at all
>> >> > because it might not be guaranteed by all the file systems. So I
>> >> > think this last part of the test is rather pointless. So I think
>> >> > this patch is not needed as well.
>> >> >
>> >> > Or am I missing something Filipe ?
>> >>
>> >> So that second part of the test, essentially comes from this ordered
>> >> metadata dependency explanation Dave gave me recently on another
>> >> thread:
>> >>
>> >> http://www.spinics.net/lists/linux-btrfs/msg42086.html
>> >
>> > It's interesting, but it really applies only to metadata updates
>> > since really we normally only journal metadata. We do not
>> > consider extended attributes to be metadata, do we ?
>> >
>> >>
>> >> Yes, I'm considering xattrs as metadata (even though they can be seen
>> >> as data as well). This behaviour I'm testing for applies to ext3/4 and
>> >> xfs for example (and apparently intentional, since the test passes on
>> >> these filesystems).
>> >
>> > Ok, I am confused. Clearly ext4, nor xfs consider xattrs metadata
>> > which can be tested simply by attaching xattr and crashing the file
>> > system immediately afterwards - the new xattr will not be there -
>> > that's expected for data, but unexpected for metadata.
>>
>> Hum, my testing (on a 3.19 kernel) for both ext4 and xfs shows me
>> something different.
>> Perhaps we're testing differently.
>>
>> So I just replaced the the line that adds a hard link to our inode with:
>>
>> $SETFATTR_PROG -n user.attr4 -v val4 $SCRATCH_MNT/foobar
>>
>> Then after the crash + mount, the new xattr is listed (with the
>> correct value val4).
>> I also tried the variants of leaving the 'ln' command and adding the
>> xattr either right before or right after the 'ln' command. For both
>> cases, the new xattr was there after crash + mount.
>>
>> How did you test?
>
> Simply created file, fsync, added xattr and crash. But I think I am
> confusing consistency vs. persistence again :(

Ah in that case it's normal. You need to be sure the journal/log is
persisted by fsyncing the other file (named 'qwerty').
For the xattr removal case, if you don't fsync 'qwerty', the xattr
'user.attr1' will still be there for file 'foobar' after crash +
mount.

>
>>
>> >
>> > Now the fact that it works might be just a coincidence. Btw in the
>> > discussion Dave never mentioned xattr, he only talks about inode
>> > size and extent list changes which makes sense since those are
>> > metadata and it's expected to be "stabilised" as he very well
>> > described. I just do not think this applies to this case.
>>
>> Correct, I assumed his explanation applied to metadata in general and
>> I'm considering xattrs as metadata (though as I said before, it's not
>> unreasonable to consider them as data as well imho).
>
> Well that's the thing I am not really sure about. Can not find any
> documentation about this anywhere at all.
>
> -Lukas
>
>>
>> >
>> > Also I think that his wording that fsync on the file implies fsync
>> > on the directory is unfortunate because it does not. However it
>> > implies that the directory will actually be stabilised as well due
>> > to journalling. But the results are the same.
>> >
>> > Anyway, maybe Dave can sprinkle some of his wisdom over this...
>>
>> Yes :)
>>
>> Thanks
>>
>> >
>> > Thanks!
>> > -Lukas
>> >
>> >>
>> >> Thanks
>> >>
>> >> >
>> >> > Thanks!
>> >> > -Lukas
>> >> >
>> >> >> Otherwise, file A is recovered to #3.
>> >> >
>> >> >
>> >> >>
>> >> >> Cc: Filipe Manana <fdmanana@suse.com>
>> >> >> Cc: Eric Sandeen <sandeen@redhat.com>
>> >> >> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>> >> >> ---
>> >> >>  common/rc         | 18 ++++++++++++++++++
>> >> >>  tests/generic/066 |  2 +-
>> >> >>  2 files changed, 19 insertions(+), 1 deletion(-)
>> >> >>
>> >> >> diff --git a/common/rc b/common/rc
>> >> >> index 1ed9df5..e6e8d1f 100644
>> >> >> --- a/common/rc
>> >> >> +++ b/common/rc
>> >> >> @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
>> >> >>       esac
>> >> >>  }
>> >> >>
>> >> >> +# Does this filesystem support metadata replay?
>> >> >> +# Filesystem is able to recover metadata which were not written by fsync
>> >> >> +# exlicitly. But another fsync'ed metadata should be followed by them.
>> >> >> +_require_metadata_replay()
>> >> >> +{
>> >> >> +     _require_metadata_journaling $1
>> >> >> +
>> >> >> +     case "$FSTYP" in
>> >> >> +     f2fs)
>> >> >> +             # f2fs supports metadata_journaling, but does not recover any
>> >> >> +             # intermediate metadata which was not fsync'ed explicitly.
>> >> >> +             _notrun "$FSTYP does not support metadata replay"
>> >> >> +             ;;
>> >> >> +     *)
>> >> >> +             ;;
>> >> >> +     esac
>> >> >> +}
>> >> >> +
>> >> >>  # Does fiemap support?
>> >> >>  _require_fiemap()
>> >> >>  {
>> >> >> diff --git a/tests/generic/066 b/tests/generic/066
>> >> >> index cb36506..3fefda4 100755
>> >> >> --- a/tests/generic/066
>> >> >> +++ b/tests/generic/066
>> >> >> @@ -61,7 +61,7 @@ _need_to_be_root
>> >> >>  _require_scratch
>> >> >>  _require_dm_flakey
>> >> >>  _require_attrs
>> >> >> -_require_metadata_journaling $SCRATCH_DEV
>> >> >> +_require_metadata_replay $SCRATCH_DEV
>> >> >>
>> >> >>  _crash_and_mount()
>> >> >>  {
>> >> >>
>> >> >
>> >> > ------------------------------------------------------------------------------
>> >> > Dive into the World of Parallel Programming The Go Parallel Website, sponsored
>> >> > by Intel and developed in partnership with Slashdot Media, is your hub for all
>> >> > things parallel software development, from weekly thought leadership blogs to
>> >> > news, videos, case studies, tutorials and more. Take a look and join the
>> >> > conversation now. http://goparallel.sourceforge.net/
>> >> > _______________________________________________
>> >> > Linux-f2fs-devel mailing list
>> >> > Linux-f2fs-devel@lists.sourceforge.net
>> >> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>> >>
>> >>
>> >>
>> >>
>>
>>
>>
>>
Jaegeuk Kim Feb. 27, 2015, 7:02 p.m. UTC | #7
Hi Lukas,

On Fri, Feb 27, 2015 at 12:34:09PM +0100, Lukáš Czerner wrote:
> On Thu, 26 Feb 2015, Jaegeuk Kim wrote:
> 
> > Date: Thu, 26 Feb 2015 17:23:47 -0800
> > From: Jaegeuk Kim <jaegeuk@kernel.org>
> > To: Dave Chinner <david@fromorbit.com>
> > Cc: fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
> >     Jaegeuk Kim <jaegeuk@kernel.org>, Filipe Manana <fdmanana@suse.com>,
> >     Eric Sandeen <sandeen@redhat.com>
> > Subject: [PATCH 2/2] generic/066: add _require_metadata_replay
> > 
> > This patch adds _require_metadata_replay to detect whether or not filesystem
> > supports metadata replay.
> > 
> > This should be used when:
> > 
> > 1. create file A
> > 2. write file A
> > 3. fsync file A
> > 
> > 4. write file A
> > 
> > 5. create file B
> > 6. fsync file B
> > 7. crash!
> > 
> > In this case, if filesystem supports metadata_replay, file A's data written
> > by #4 should be recovered.
> 
> What ? No it should not! file A was never fsync after the last
> write, so there is no guarantee that we'll have the new content.
> "Metadata replay" suggests that we only replay metadata, not data
> and that's what file systems usually do with they journals, cows,
> and so on...

I totally wrote this description insanely.
My apologies.

Yes, what I wanted to point out was the coverage of xattr recovery, not data.
1. create file A
2. write file A
3. fsync file A

*4. setxattr file A

5. create file B
6. fsync file B
7. crash!

What I've focussed on actually was the coverage of metadata replay which depends
on filesystem's design.

In this case, I think the metadata log contains:
...
1. fsync'ed file A
2. changed xattrs on file A
3. fsync'ed file B

Currently, xfs and ext4 recover #2, and btrfs without Filipe's patch does not.
And, I understood that this testcase intends to check #2 is recoverable or not.

IMHO, however, basically filesystem doesn't need to recover #2 at all.
So, I tried to add _require_something to check whether filesystem recovers
written xattrs followed by any fsync'ed file coincidentally.

Thanks,

>
> 
> But this test is not about writing to a file, it's about changing
> file xattr. And while extended attributes are metadata, they are
> _not_ file system metadata and as such I think it can be very well
> treated as data. Hence explicit fsync is needed to make sure it's
> written to the disk.
> 
> Now let's look at the test itself:
> 
> 
> 	echo "hello world!" >> $SCRATCH_MNT/foobar
> 	$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
> 	$SETFATTR_PROG -x user.attr1 $SCRATCH_MNT/foobar
> 	ln $SCRATCH_MNT/foobar $SCRATCH_MNT/foobar_link
> 	touch $SCRATCH_MNT/qwerty
> 	$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/qwerty
> 
> 	_crash_and_mount
> 
> 	# Now only the xattr with name user.attr3 should be set in our file.
> 	echo "xattr names and values after second fsync log replay:"
> 	$GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/foobar |
> 	_filter_scratch
> 
> No it should not have only attr3 because $SCRATCH_MNT/foobar was
> never fsynced after attr3 removal. If the internal design of the
> file system accidentally synces unrelated xattrs on unrelated fsync,
> than that's fine, but that's not what we need to test for at all
> because it might not be guaranteed by all the file systems. So I
> think this last part of the test is rather pointless. So I think
> this patch is not needed as well.
> 
> Or am I missing something Filipe ?
> 
> Thanks!
> -Lukas
> 
> > Otherwise, file A is recovered to #3.
> 
> 
> > 
> > Cc: Filipe Manana <fdmanana@suse.com>
> > Cc: Eric Sandeen <sandeen@redhat.com>
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  common/rc         | 18 ++++++++++++++++++
> >  tests/generic/066 |  2 +-
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/common/rc b/common/rc
> > index 1ed9df5..e6e8d1f 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -2372,6 +2372,24 @@ _require_metadata_journaling()
> >  	esac
> >  }
> >  
> > +# Does this filesystem support metadata replay?
> > +# Filesystem is able to recover metadata which were not written by fsync
> > +# exlicitly. But another fsync'ed metadata should be followed by them.
> > +_require_metadata_replay()
> > +{
> > +	_require_metadata_journaling $1
> > +
> > +	case "$FSTYP" in
> > +	f2fs)
> > +		# f2fs supports metadata_journaling, but does not recover any
> > +		# intermediate metadata which was not fsync'ed explicitly.
> > +		_notrun "$FSTYP does not support metadata replay"
> > +		;;
> > +	*)
> > +		;;
> > +	esac
> > +}
> > +
> >  # Does fiemap support?
> >  _require_fiemap()
> >  {
> > diff --git a/tests/generic/066 b/tests/generic/066
> > index cb36506..3fefda4 100755
> > --- a/tests/generic/066
> > +++ b/tests/generic/066
> > @@ -61,7 +61,7 @@ _need_to_be_root
> >  _require_scratch
> >  _require_dm_flakey
> >  _require_attrs
> > -_require_metadata_journaling $SCRATCH_DEV
> > +_require_metadata_replay $SCRATCH_DEV
> >  
> >  _crash_and_mount()
> >  {
> > 
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dave Chinner March 18, 2015, 3:33 a.m. UTC | #8
On Fri, Feb 27, 2015 at 02:10:55PM +0100, Lukáš Czerner wrote:
> It's interesting, but it really applies only to metadata updates
> since really we normally only journal metadata. We do not
> consider extended attributes to be metadata, do we ?

Just to close the circle here, seeing as I don't think this was
answered: XFS considers all xattrs as metadata.

> > Yes, I'm considering xattrs as metadata (even though they can be seen
> > as data as well). This behaviour I'm testing for applies to ext3/4 and
> > xfs for example (and apparently intentional, since the test passes on
> > these filesystems).
> 
> Ok, I am confused. Clearly ext4, nor xfs consider xattrs metadata
> which can be tested simply by attaching xattr and crashing the file
> system immediately afterwards - the new xattr will not be there -
> that's expected for data, but unexpected for metadata.

It is expected of metadata if there was no fsync.

> Now the fact that it works might be just a coincidence. Btw in the
> discussion Dave never mentioned xattr, he only talks about inode
> size and extent list changes which makes sense since those are
> metadata and it's expected to be "stabilised" as he very well
> described. I just do not think this applies to this case.

xattrs are part of the journalled inode metadata in XFS, just like
the size and data extent tree.

> Also I think that his wording that fsync on the file implies fsync
> on the directory is unfortunate because it does not.

POSIX does not define how file/directory synchronisation should
work - it allows fsync() to be a complete no-op, so we are really on
our own here. i.e. we define the behaviour ourselves.

> However it
> implies that the directory will actually be stabilised as well due
> to journalling. But the results are the same.

Exactly - what I've described previously is based on the
transactional model that ext4, XFS and btrfs use - they all use a
strongly ordered atomic transaction model. That is, if we commit
transaction N to stable storage, we also commit N-1, N-2, ... and
N-m. i.e. we commit everything from the last synchronisation point
up to the current sync target.

That gives quite clear dependency rules to fsync. e.g:

	create file "X" in dir "Y" (tx N)
	write 1 byte to X	   (tx N+1)
	fsync X			   (force out tx N, N+1)

When fsync completes, we are guaranteeing that the application will
be able to find the byte we wrote to X. That also implies that
directory Y has a dirent that points to X, and that X has a file
size of 1 and and extent that points to the allocated block.

i.e. fsync() implies that all metadata needed to reference the data
that has been synced is present on disk. that means "fsync X" also
implies "fsync Y" because Y is the only way of finding X. However,
if we do this:

	create file "X" in dir "Y" (tx N)
	write 1 byte to X	   (tx N+1)
	add xattr to Y		   (tx N+2)
	fsync X			   (force out tx N, N+1)

the fsync of X is not guaranteed to stabilise "xattr Y" because that
change occurred *after* the dependency between X and Y was created
and is not required to be synced to resolve the dependency between X
and Y...

The devil is in the detail, but we really should see XFS, ext4 and
btrfs all provide the same fsync behaviour w.r.t. metadata and
fsync. Consistency is data integrity behaviour across different
filesystems is a good thing. :)

Cheers,

Dave.
diff mbox

Patch

diff --git a/common/rc b/common/rc
index 1ed9df5..e6e8d1f 100644
--- a/common/rc
+++ b/common/rc
@@ -2372,6 +2372,24 @@  _require_metadata_journaling()
 	esac
 }
 
+# Does this filesystem support metadata replay?
+# Filesystem is able to recover metadata which were not written by fsync
+# exlicitly. But another fsync'ed metadata should be followed by them.
+_require_metadata_replay()
+{
+	_require_metadata_journaling $1
+
+	case "$FSTYP" in
+	f2fs)
+		# f2fs supports metadata_journaling, but does not recover any
+		# intermediate metadata which was not fsync'ed explicitly.
+		_notrun "$FSTYP does not support metadata replay"
+		;;
+	*)
+		;;
+	esac
+}
+
 # Does fiemap support?
 _require_fiemap()
 {
diff --git a/tests/generic/066 b/tests/generic/066
index cb36506..3fefda4 100755
--- a/tests/generic/066
+++ b/tests/generic/066
@@ -61,7 +61,7 @@  _need_to_be_root
 _require_scratch
 _require_dm_flakey
 _require_attrs
-_require_metadata_journaling $SCRATCH_DEV
+_require_metadata_replay $SCRATCH_DEV
 
 _crash_and_mount()
 {