Message ID | e367054d-046d-d6e5-943c-7fbe10493029@sandeen.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 2017年12月25日 13:56, Eric Sandeen wrote: > On 12/24/17 9:28 PM, Chen Rong wrote: >> Hi, everyone: >> >> the issue as below: > First we need to look - what does the test do? > > # Test that if we rename a file, create a new file that has the old name of the > # other file and is a child of the same parent directory, fsync the new inode, > # power fail and mount the filesystem, we do not lose the first file and that > # file has the name it was renamed to. > > Ok, so it is a test of fsync'd file renames/creation over simulated device > failure, and wants to be sure that all files and file contents are as > expected if data integrity syscalls were made before the device failure. > >> # ./check generic/342 >> FSTYP -- f2fs >> PLATFORM -- Linux/x86_64 node01 4.15.0-rc4 >> MKFS_OPTIONS -- /dev/sde1 >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sde1 /tmp/test1 >> >> generic/342 - output mismatch (see /home/ubuntu/git/xfstests-dev/results//generic/342.out.bad) >> --- tests/generic/342.out 2017-12-13 13:47:20.144000000 -0500 >> +++ /home/ubuntu/git/xfstests-dev/results//generic/342.out.bad 2017-12-25 11:13:12.928000000 -0500 >> @@ -8,8 +8,7 @@ >> 48c940ba3b8671d3d6ea74e4ccad8ca3 SCRATCH_MNT/a/bar >> Directory a/ contents after log replay: >> SCRATCH_MNT/a: >> -bar > the test created foo, fsynced it, then renamed it to bar and created a > /new/ file also named foo, then fsynced the new file foo. > > Despite that 2nd fsync, the renamed file "bar" is not present in > your case. > > However, don't we really need to fsync the directory after > the rename and recreation to ensure persistence? > > iows: does this patch make the test pass on f2fs? It does > for me, and I think it's the only valid way to run the test: the patch works for me. but btrfs could pass the test without it, why so different? > diff --git a/tests/generic/342 b/tests/generic/342 > index ed64e05..6a9e5bd 100755 > --- a/tests/generic/342 > +++ b/tests/generic/342 > @@ -68,6 +68,7 @@ sync > mv $SCRATCH_MNT/a/foo $SCRATCH_MNT/a/bar > $XFS_IO_PROG -f -c "pwrite -S 0xba 0 16K" $SCRATCH_MNT/a/foo | _filter_xfs_io > $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo > +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a > > echo "File digests before log replay:" > md5sum $SCRATCH_MNT/a/foo | _filter_scratch > > without an fsync of the parent dir, I don't think we can > guarantee that filename changes or additions will persist > on every filesystem. > >> foo >> File digests after log replay: >> 9e5d56a1f9b2c93589f9d55480f971a1 SCRATCH_MNT/a/foo >> ... >> (Run 'diff -u tests/generic/342.out /home/ubuntu/git/xfstests-dev/results//generic/342.out.bad' to see the entire diff) > Generally best to do that suggested diff command to see if > there are any more pieces of changed/wrong test output. > >> Ran: generic/342 >> Failures: generic/342 >> Failed 1 of 1 tests >> >> Is it a problem with my computer or a already known issue with f2fs? > or a problem with the test itself ... cool! > > -Eric > >> thanks! >> >> -- >> To unsubscribe from this list: send the line "unsubscribe fstests" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/24/17 10:30 PM, Chen Rong wrote: > > > On 2017年12月25日 13:56, Eric Sandeen wrote: >> On 12/24/17 9:28 PM, Chen Rong wrote: >>> Hi, everyone: >>> >>> the issue as below: >> First we need to look - what does the test do? >> >> # Test that if we rename a file, create a new file that has the old name of the >> # other file and is a child of the same parent directory, fsync the new inode, >> # power fail and mount the filesystem, we do not lose the first file and that >> # file has the name it was renamed to. >> >> Ok, so it is a test of fsync'd file renames/creation over simulated device >> failure, and wants to be sure that all files and file contents are as >> expected if data integrity syscalls were made before the device failure. >> >>> # ./check generic/342 >>> FSTYP -- f2fs >>> PLATFORM -- Linux/x86_64 node01 4.15.0-rc4 >>> MKFS_OPTIONS -- /dev/sde1 >>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/sde1 /tmp/test1 >>> >>> generic/342 - output mismatch (see /home/ubuntu/git/xfstests-dev/results//generic/342.out.bad) >>> --- tests/generic/342.out 2017-12-13 13:47:20.144000000 -0500 >>> +++ /home/ubuntu/git/xfstests-dev/results//generic/342.out.bad 2017-12-25 11:13:12.928000000 -0500 >>> @@ -8,8 +8,7 @@ >>> 48c940ba3b8671d3d6ea74e4ccad8ca3 SCRATCH_MNT/a/bar >>> Directory a/ contents after log replay: >>> SCRATCH_MNT/a: >>> -bar >> the test created foo, fsynced it, then renamed it to bar and created a >> /new/ file also named foo, then fsynced the new file foo. >> >> Despite that 2nd fsync, the renamed file "bar" is not present in >> your case. >> >> However, don't we really need to fsync the directory after >> the rename and recreation to ensure persistence? >> >> iows: does this patch make the test pass on f2fs? It does >> for me, and I think it's the only valid way to run the test: > the patch works for me. but btrfs could pass the test without it, why so different? Filesystems are free to do /more/ than the minimum required by posix - see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous mounts: * If this is a synchronous mount, make sure that the rename transaction * goes to disk before returning to the user. */ if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) xfs_trans_set_sync(tp); so behavior can be fs-dependent, or mount option dependent, etc. But to be portable, if an app wants directory changes to be persistent before proceeding, it must fsync the directory after making changes. I don't know about f2fs's design intent, whether it guarantees more than posix requires, but to guarantee that this test works on any posix fs, I think that directory fsync is needed. -Eric -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: > Filesystems are free to do /more/ than the minimum required by posix - > see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous > mounts: > > * If this is a synchronous mount, make sure that the rename transaction > * goes to disk before returning to the user. > */ > if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) > xfs_trans_set_sync(tp); > > so behavior can be fs-dependent, or mount option dependent, etc. > > But to be portable, if an app wants directory changes to be persistent > before proceeding, it must fsync the directory after making changes. > > I don't know about f2fs's design intent, whether it guarantees more > than posix requires, but to guarantee that this test works on any posix > fs, I think that directory fsync is needed. Agreed that this is a test bug, and we should add the fsync to the parent directory. It might also be a good idea for f2fs to do more, given that fsync is a slow enough operation that so long as you can make sure the fsync of the parent directory happens within the same atomic update as the child inode, you might as well give the more expansive guarantee. But obviously that's up to the f2fs developers to decide whether they want to do that work. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/25, Theodore Ts'o wrote: > On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: > > Filesystems are free to do /more/ than the minimum required by posix - > > see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous > > mounts: > > > > * If this is a synchronous mount, make sure that the rename transaction > > * goes to disk before returning to the user. > > */ > > if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) > > xfs_trans_set_sync(tp); > > > > so behavior can be fs-dependent, or mount option dependent, etc. > > > > But to be portable, if an app wants directory changes to be persistent > > before proceeding, it must fsync the directory after making changes. > > > > I don't know about f2fs's design intent, whether it guarantees more > > than posix requires, but to guarantee that this test works on any posix > > fs, I think that directory fsync is needed. > > Agreed that this is a test bug, and we should add the fsync to the > parent directory. Agreed too. Or, how about using "-o dirsync"? > > It might also be a good idea for f2fs to do more, given that fsync is > a slow enough operation that so long as you can make sure the fsync of > the parent directory happens within the same atomic update as the > child inode, you might as well give the more expansive guarantee. But > obviously that's up to the f2fs developers to decide whether they want > to do that work. Indeed. Actually, since one of our goals was to reduce fsync latencies for Android, we decided to support posix in a minimum way. In order to avoid complex directory updates recursively, however, we allowed the fsync on directories to trigger checkpoint requiring many IO operations. IMHO, as a quick fix, it seems we need to handle MS_DIRSYNC likewse xfs. #define MS_DIRSYNC 128 /* Directory modifications are synchronous */ Thanks, > > Cheers, > > - Ted > > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2017/12/28 3:11, Jaegeuk Kim wrote: > On 12/25, Theodore Ts'o wrote: >> On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: >>> Filesystems are free to do /more/ than the minimum required by posix - >>> see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous >>> mounts: >>> >>> * If this is a synchronous mount, make sure that the rename transaction >>> * goes to disk before returning to the user. >>> */ >>> if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) >>> xfs_trans_set_sync(tp); >>> >>> so behavior can be fs-dependent, or mount option dependent, etc. >>> >>> But to be portable, if an app wants directory changes to be persistent >>> before proceeding, it must fsync the directory after making changes. >>> >>> I don't know about f2fs's design intent, whether it guarantees more >>> than posix requires, but to guarantee that this test works on any posix >>> fs, I think that directory fsync is needed. >> >> Agreed that this is a test bug, and we should add the fsync to the >> parent directory. Agreeed. > > Agreed too. Or, how about using "-o dirsync"? IMO, according to comments of generic/342, it wants to test posix sematic that supported by current fs' fsync(), if we want to test dirsync mount option, that would be better in another testcase? > >> >> It might also be a good idea for f2fs to do more, given that fsync is >> a slow enough operation that so long as you can make sure the fsync of >> the parent directory happens within the same atomic update as the >> child inode, you might as well give the more expansive guarantee. But >> obviously that's up to the f2fs developers to decide whether they want >> to do that work. > > Indeed. Actually, since one of our goals was to reduce fsync latencies for > Android, we decided to support posix in a minimum way. In order to avoid Agreed, in order to maximize performance of fsync regular file, we'd better not change current sematic of f2fs' fsync so far, moreover, there is no such requirement from upper layer application. > complex directory updates recursively, however, we allowed the fsync on > directories to trigger checkpoint requiring many IO operations. Some applications use its own database software layer instead generic sqlite in android, and they use deletion mode in their database, so whenever they trigger database transaction, we need to do checkpoint. There is such scenario that, before checkpoint it comes huge number of node & meta updating, result in that checkpoint becomes very slow. Maybe we can consider to fsync checkpointed directory with the way fsyncing regular file rather than triggering checkpoint to improve its performance? > > IMHO, as a quick fix, it seems we need to handle MS_DIRSYNC likewse xfs. > #define MS_DIRSYNC 128 /* Directory modifications are synchronous */ We have supported that yet in commit b7e1d800031c ("f2fs: implement -o dirsync"). ;) Thanks, > > Thanks, > >> >> Cheers, >> >> - Ted >> >> -- >> To unsubscribe from this list: send the line "unsubscribe fstests" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > . > -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 28, 2017 at 11:17:09AM +0800, Chao Yu wrote: > > Indeed. Actually, since one of our goals was to reduce fsync latencies for > > Android, we decided to support posix in a minimum way. In order to avoid > > Agreed, in order to maximize performance of fsync regular file, we'd better > not change current sematic of f2fs' fsync so far, moreover, there is no > such requirement from upper layer application. The cost of forcing an fsync on the parent directory only if the regular file was just created should really be minimal. Most of the performance critical use cases for fsync() is after writing data on an already-existing file. Even in the case where a package manager is writing, say, lots of newly created files followed by an fsync (such as a 21 megabyte docker binary) the overhead of including a handful of 4k metadata blocks for the containing directory to make sure /usr/bin/docker exists as part of the fsync() is in the noise compared to all of the other data blocks being written by the package manager. So I suspect the concerns of "oh, we can't provide the same guarantees as ext4 and f2fs" are a bit overblown here. But ultimately, that's up to the f2fs developers... > > IMHO, as a quick fix, it seems we need to handle MS_DIRSYNC likewse xfs. > > #define MS_DIRSYNC 128 /* Directory modifications are synchronous */ > > We have supported that yet in commit b7e1d800031c ("f2fs: implement -o > dirsync"). ;) To be clear, we're not talking about dirsync mode where *all* directory modifications are synchronous. What we're talking about here is the case where if you fsync() a regular file, and the regular file is newly created --- AND ONLY THAT CASE --- to make sure the parent directory is included in the set of metadata blocks updated by the fsync() against the regular file. Best regards, - Ted -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Dec 27, 2017 at 11:11:30AM -0800, Jaegeuk Kim wrote: > On 12/25, Theodore Ts'o wrote: > > On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: > > > Filesystems are free to do /more/ than the minimum required by posix - > > > see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous > > > mounts: > > > > > > * If this is a synchronous mount, make sure that the rename transaction > > > * goes to disk before returning to the user. > > > */ > > > if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) > > > xfs_trans_set_sync(tp); > > > > > > so behavior can be fs-dependent, or mount option dependent, etc. > > > > > > But to be portable, if an app wants directory changes to be persistent > > > before proceeding, it must fsync the directory after making changes. > > > > > > I don't know about f2fs's design intent, whether it guarantees more > > > than posix requires, but to guarantee that this test works on any posix > > > fs, I think that directory fsync is needed. > > > > Agreed that this is a test bug, and we should add the fsync to the > > parent directory. > > Agreed too. Or, how about using "-o dirsync"? > > > > > It might also be a good idea for f2fs to do more, given that fsync is > > a slow enough operation that so long as you can make sure the fsync of > > the parent directory happens within the same atomic update as the > > child inode, you might as well give the more expansive guarantee. But > > obviously that's up to the f2fs developers to decide whether they want > > to do that work. > > Indeed. Actually, since one of our goals was to reduce fsync latencies for > Android, we decided to support posix in a minimum way. In order to avoid > complex directory updates recursively, however, we allowed the fsync on > directories to trigger checkpoint requiring many IO operations. So what you are really saying is that f2fs is not strictly ordered w.r.t metadata crash consistency after fsync()? Wasn't that considered a bug in btrfs that had to be fixed (and did get fixed)? Oh, yeah, it's right there in the test commit history: commit f02fe949113f35ae221ec1ab5c9959912f594bf4 Author: Filipe Manana <fdmanana@suse.com> Date: Tue Apr 5 11:47:55 2016 +1000 generic: add test for fsync after renaming file Test that if we rename a file, create a new file that has the old name of the other file and is a child of the same parent directory, fsync the new inode, power fail and mount the filesystem, we do not lose the first file and that file has the name it was renamed to. This test is motivated by an issue found in btrfs which is fixed by the following patch for the linux kernel: "Btrfs: fix file loss caused by fsync after rename and new inode" Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> There's a whole lot more detail in the kernel commit 2be63d5ce929 ("Btrfs: fix file loss on log replay after renaming a file and fsync") but my point is that we considered this a btrfs filesystem bug and so changing the test defeats it's purpose as a regression test for the btrfs bug. So IMO the test should not be changed. And I think we should be consistent and consider this f2fs failure as a f2fs bug that needs fixing to bring it's behaviour in line with xfs, ext4, and btrfs. Remember this when quoting POSIX about fsync behaviour: Posix is a terrible standard when it comes to data integrity. We go way, way beyond what POSIX specifies as a valid fsync implementation (i.e. posix allows "do nothing and return success" as a conformant implementation). Ext4, XFS and btrfs all have strictly ordered metadata crash recovery semantics and all of the crash recovery tests expect this behaviour from the filesytem being tested. The underlying intention is that by encoding it into these tests, all widely used and future linux filesystems meet or exceed this crash integrity requirement. IOWs, changing the test is the wrong thing to do on many levels.... Cheers, Dave.
On 12/28, Dave Chinner wrote: > On Wed, Dec 27, 2017 at 11:11:30AM -0800, Jaegeuk Kim wrote: > > On 12/25, Theodore Ts'o wrote: > > > On Sun, Dec 24, 2017 at 11:47:20PM -0800, Eric Sandeen wrote: > > > > Filesystems are free to do /more/ than the minimum required by posix - > > > > see ext4_sync_parent for example. Or xfs_finish_rename, for synchronous > > > > mounts: > > > > > > > > * If this is a synchronous mount, make sure that the rename transaction > > > > * goes to disk before returning to the user. > > > > */ > > > > if (tp->t_mountp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC)) > > > > xfs_trans_set_sync(tp); > > > > > > > > so behavior can be fs-dependent, or mount option dependent, etc. > > > > > > > > But to be portable, if an app wants directory changes to be persistent > > > > before proceeding, it must fsync the directory after making changes. > > > > > > > > I don't know about f2fs's design intent, whether it guarantees more > > > > than posix requires, but to guarantee that this test works on any posix > > > > fs, I think that directory fsync is needed. > > > > > > Agreed that this is a test bug, and we should add the fsync to the > > > parent directory. > > > > Agreed too. Or, how about using "-o dirsync"? > > > > > > > > It might also be a good idea for f2fs to do more, given that fsync is > > > a slow enough operation that so long as you can make sure the fsync of > > > the parent directory happens within the same atomic update as the > > > child inode, you might as well give the more expansive guarantee. But > > > obviously that's up to the f2fs developers to decide whether they want > > > to do that work. > > > > Indeed. Actually, since one of our goals was to reduce fsync latencies for > > Android, we decided to support posix in a minimum way. In order to avoid > > complex directory updates recursively, however, we allowed the fsync on > > directories to trigger checkpoint requiring many IO operations. > > So what you are really saying is that f2fs is not strictly ordered > w.r.t metadata crash consistency after fsync()? Wasn't that > considered a bug in btrfs that had to be fixed (and did get fixed)? > > Oh, yeah, it's right there in the test commit history: > > commit f02fe949113f35ae221ec1ab5c9959912f594bf4 > Author: Filipe Manana <fdmanana@suse.com> > Date: Tue Apr 5 11:47:55 2016 +1000 > > generic: add test for fsync after renaming file > > Test that if we rename a file, create a new file that has the old name > of the other file and is a child of the same parent directory, fsync the > new inode, power fail and mount the filesystem, we do not lose the first > file and that file has the name it was renamed to. > > This test is motivated by an issue found in btrfs which is fixed by the > following patch for the linux kernel: > > "Btrfs: fix file loss caused by fsync after rename and new inode" > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > Reviewed-by: Eryu Guan <eguan@redhat.com> > Signed-off-by: Dave Chinner <david@fromorbit.com> > > There's a whole lot more detail in the kernel commit 2be63d5ce929 > ("Btrfs: fix file loss on log replay after renaming a file and > fsync") but my point is that we considered this a btrfs filesystem > bug and so changing the test defeats it's purpose as a regression > test for the btrfs bug. > > So IMO the test should not be changed. And I think we should be > consistent and consider this f2fs failure as a f2fs bug that needs > fixing to bring it's behaviour in line with xfs, ext4, and btrfs. > > Remember this when quoting POSIX about fsync behaviour: Posix is a > terrible standard when it comes to data integrity. We go way, way > beyond what POSIX specifies as a valid fsync implementation (i.e. > posix allows "do nothing and return success" as a conformant > implementation). Ext4, XFS and btrfs all have strictly ordered > metadata crash recovery semantics and all of the crash recovery > tests expect this behaviour from the filesytem being tested. The > underlying intention is that by encoding it into these tests, all > widely used and future linux filesystems meet or exceed this crash > integrity requirement. I just noticed that this test is not applied for generic filesystems, but for ones that support metadata_journaling. So, IMHO, we don't have to rely on POSIX too much here. Back to metadata_journalling, we indeed have to treat this as a bug in f2fs, since this recovery behavior sounds like defacto across the major filesystems. I submitted a patch to address this, based on the hint by Ted. https://patchwork.kernel.org/patch/10135085/ Thanks, > > IOWs, changing the test is the wrong thing to do on many levels.... > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/28/17 1:09 AM, Dave Chinner wrote: ... > There's a whole lot more detail in the kernel commit 2be63d5ce929 > ("Btrfs: fix file loss on log replay after renaming a file and > fsync") but my point is that we considered this a btrfs filesystem > bug and so changing the test defeats it's purpose as a regression > test for the btrfs bug. > > So IMO the test should not be changed. And I think we should be > consistent and consider this f2fs failure as a f2fs bug that needs > fixing to bring it's behaviour in line with xfs, ext4, and btrfs. > > Remember this when quoting POSIX about fsync behaviour: Posix is a > terrible standard when it comes to data integrity. We go way, way > beyond what POSIX specifies as a valid fsync implementation (i.e. > posix allows "do nothing and return success" as a conformant > implementation). Ext4, XFS and btrfs all have strictly ordered > metadata crash recovery semantics and all of the crash recovery > tests expect this behaviour from the filesytem being tested. The > underlying intention is that by encoding it into these tests, all > widely used and future linux filesystems meet or exceed this crash > integrity requirement. > > IOWs, changing the test is the wrong thing to do on many levels.... > > Cheers, > > Dave. > Thanks for digging into the btrfs commit which changed the behavior tested by this testcase, I had meant to do that. Yeah, that's all fair, for sure. But this shouldn't be implicit in the testcase; it should explicitly document that it is testing a de-facto new standard adhered to by several filesystems, what that standard is, etc, and not leave it to the reader. ;) If a filesystem explicitly chooses not to adhere to that standard they /could/ opt out of the test. It's makes a lot of sense for linux filesystems to behave in consistent ways, but if we're going to start enforcing that through xfstests we need to be very clear about it, IMHO. -Eric -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 28, 2017 at 08:59:18AM -0800, Eric Sandeen wrote: > On 12/28/17 1:09 AM, Dave Chinner wrote: > ... > > > There's a whole lot more detail in the kernel commit 2be63d5ce929 > > ("Btrfs: fix file loss on log replay after renaming a file and > > fsync") but my point is that we considered this a btrfs filesystem > > bug and so changing the test defeats it's purpose as a regression > > test for the btrfs bug. > > > > So IMO the test should not be changed. And I think we should be > > consistent and consider this f2fs failure as a f2fs bug that needs > > fixing to bring it's behaviour in line with xfs, ext4, and btrfs. > > > > Remember this when quoting POSIX about fsync behaviour: Posix is a > > terrible standard when it comes to data integrity. We go way, way > > beyond what POSIX specifies as a valid fsync implementation (i.e. > > posix allows "do nothing and return success" as a conformant > > implementation). Ext4, XFS and btrfs all have strictly ordered > > metadata crash recovery semantics and all of the crash recovery > > tests expect this behaviour from the filesytem being tested. The > > underlying intention is that by encoding it into these tests, all > > widely used and future linux filesystems meet or exceed this crash > > integrity requirement. > > > > IOWs, changing the test is the wrong thing to do on many levels.... > > > > Cheers, > > > > Dave. > > > > Thanks for digging into the btrfs commit which changed the behavior > tested by this testcase, I had meant to do that. > > Yeah, that's all fair, for sure. But this shouldn't be implicit in the > testcase; it should explicitly document that it is testing a de-facto > new standard adhered to by several filesystems, what that standard is, > etc, and not leave it to the reader. ;) Well, the point of the test case is to catch recovery behaviour that is not properly ordered in this complex corner case. It's not an obvious thing for developers to test, as we can see by both btrfs and now f2fs failing this test. What that failure actually means and implies is completely separate to the test itself. i.e. the test is just a canary that tells us a filesystem is not strictly ordered in it's crash recovery behaviour. If we don't have tests somewhere that exercise the behaviour we want filesystems to have, we'll never know if a filesystem does or doesn't behave as we want them to. The fact we'd like all filesystems to have strictly ordered crash recovery semantics has nothing to do with the test. It's based on the fact that XFS and ext3/4 have had this behaviour since forever, and apps written on these filesystems are quite likely to skip the directory fsync() because it's not necessary. That's seen by btrfs users reporting lost files where other filesystems are fine. We want all linux filesystems to behave the same way in these circumstances, but we can't enforce it because.... > If a filesystem explicitly chooses not to adhere to that standard they > /could/ opt out of the test. ... that is a option the filesystem developers can take. > It's makes a lot of sense for linux filesystems to behave in consistent > ways, but if we're going to start enforcing that through xfstests we > need to be very clear about it, IMHO. xfstests is not the way we enforce it - it's just tells us something is inconsistent. What the fs developers do from there (fix it or skip the test) is up to them. Cheers, Dave.
diff --git a/tests/generic/342 b/tests/generic/342 index ed64e05..6a9e5bd 100755 --- a/tests/generic/342 +++ b/tests/generic/342 @@ -68,6 +68,7 @@ sync mv $SCRATCH_MNT/a/foo $SCRATCH_MNT/a/bar $XFS_IO_PROG -f -c "pwrite -S 0xba 0 16K" $SCRATCH_MNT/a/foo | _filter_xfs_io $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a echo "File digests before log replay:" md5sum $SCRATCH_MNT/a/foo | _filter_scratch