diff mbox

[git,pull] first batch of ufs fixes

Message ID 20170610160738.GE6365@ZenIV.linux.org.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Al Viro June 10, 2017, 4:07 p.m. UTC
On Sat, Jun 10, 2017 at 06:03:24AM -0700, Richard Narron wrote:

> 2) After creating a new filesystem on FreeBSD, then on Linux copying a
> larger than 2GB file and creating a directory, the fsck back on FreeBSD
> looks ok.
> 
> But after going back to Linux and removing the large file and removing the
> directory, the fsck on FreeBSD looks not so good:

What happens is ufs_evict_inode() buggering off without syncing the inode
in case of final removal.  Incremental on top of that branch is

Committed and pushed out...

Comments

Al Viro June 10, 2017, 6:08 p.m. UTC | #1
On Sat, Jun 10, 2017 at 05:07:38PM +0100, Al Viro wrote:
> On Sat, Jun 10, 2017 at 06:03:24AM -0700, Richard Narron wrote:
> 
> > 2) After creating a new filesystem on FreeBSD, then on Linux copying a
> > larger than 2GB file and creating a directory, the fsck back on FreeBSD
> > looks ok.
> > 
> > But after going back to Linux and removing the large file and removing the
> > directory, the fsck on FreeBSD looks not so good:
> 
> What happens is ufs_evict_inode() buggering off without syncing the inode
> in case of final removal.  Incremental on top of that branch is
> diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
> index 34f11cf0900a..da553ffec85b 100644
> --- a/fs/ufs/inode.c
> +++ b/fs/ufs/inode.c
> @@ -848,6 +848,7 @@ void ufs_evict_inode(struct inode * inode)
>  		    (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
>  		     S_ISLNK(inode->i_mode)))
>  			ufs_truncate_blocks(inode);
> +		ufs_update_inode(inode, inode_needs_sync(inode));
>  	}
>  
>  	invalidate_inode_buffers(inode);
> 
> Committed and pushed out...

BTW, should I send an updated pull request in such situation?  It's the
same branch with the one-liner above added on top, head should be
at 67a70017fa0a152657bc7e337e69bb9c9f5549bf, stats
Al Viro (8):
      ufs: restore proper tail allocation
      fix ufs_isblockset()
      ufs: restore maintaining ->i_blocks
      ufs: set correct ->s_maxsize
      ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
      ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
      excessive checks in ufs_write_failed() and ufs_evict_inode()
      ufs: we need to sync inode before freeing it

 fs/stat.c       |  1 +
 fs/ufs/balloc.c | 26 +++++++++++++++++++++++++-
 fs/ufs/inode.c  | 28 ++++++++++++----------------
 fs/ufs/super.c  | 18 ++++++++++++++++++
 fs/ufs/util.h   | 10 +++++++---
 5 files changed, 63 insertions(+), 20 deletions(-)

If you prefer the full git-request-pull output (I've no idea what kind
of scripts you are using and how much PITA do they prevent), please
yell.
Linus Torvalds June 10, 2017, 6:12 p.m. UTC | #2
On Sat, Jun 10, 2017 at 11:08 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> BTW, should I send an updated pull request in such situation?

It's better if you do, although in this case it was obvious that you'd
just added a single line and I could see the diffstat still match with
that addition.

But in general it just makes things easier for me when I see that
updated pull request, and it is obvious that "yes, Al clearly meant me
to pull that, despite it not matching the original pull request".

                   Linus
Richard Narron June 11, 2017, 7:47 p.m. UTC | #3
The ufs fixes are looking good, much better than what we started with.

After applying the 8 ufs patches:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
/commit/?id=5faab9e0f03c4eef97886b45436015e107f79f5f

Copying files to FreeBSD 11.0 (ufs2) and NetBSD 7.1 (ufs2) looks
pretty good.

I can copy a large >2GB file, create a directory and then run and 
fsck without errors with a FreeBSD disk and a NetBSD disk.

But...

1) Creating an OpenBSD 6.1 (44bsd) disk and then on Linux copying a 
large > 2GB file and creating a directory, there are errors on OpenBSD
with the fsck:

OpenBSD (44bsd):
   #fsck sd0e
   ** /dev/rsd0e
   ** File system is already clean
   ** Last Mounted on /diske
   ** Phase 1 - Check Blocks and Sizes
   ** Phase 2 - Check Pathnames
   ** Phase 3 - Check Connectivity
   ** Phase 4 - Check Reference Counts
   ** Phase 5 - Check Cyl groups
   FREE BLK COUNT(S) WRONG IN SUPERBLK
   SALVAGE? [Fyn?]
   3 files, 1272410 used, 6983197 free (13 frags, 872898 blocks, 0.0% fragmentation)

   ***** FILE SYSTEM WAS MODIFIED *****

And also after removing the files on Linux, there were errors in
the fsck on OpenBSD:

   #fsck sd0e
   ** /dev/rsd0e
   ** File system is already clean
   ** Last Mounted on /diske
   ** Phase 1 - Check Blocks and Sizes
   ** Phase 2 - Check Pathnames
   ** Phase 3 - Check Connectivity
   ** Phase 4 - Check Reference Counts
   ** Phase 5 - Check Cyl groups
   FREE BLK COUNT(S) WRONG IN SUPERBLK
   SALVAGE? [Fyn?]
   1 files, 1 used, 8255606 free (14 frags, 1031949 blocks, 0.0% fragmentation)

   ***** FILE SYSTEM WAS MODIFIED *****

2) The available block counts on a newly created FreeBSD 11.0
     ufs2 filesystem (13079640) differ from Linux 4.12-rc4 (13079656):

Freebsd (ufs2):
   #df /diske
   Filesystem   1K-blocks Used    Avail Capacity  Mounted on
   /dev/ada0s3e  14217008    8 13079640     0%    /diske
Linux:
   #df /fbsd23
   Filesystem     1K-blocks  Used Available Use% Mounted on
   /dev/sda23      14217008     8  13079656   1% /fbsd23

3) The available block counts on a newly created NetBSD 7.1
     ufs2 filesystem (9518316) differ from Linux 4.12-rc4 (9518314):

NetBSD (ufs2):
   #df /nbsdf
   Filesystem     1K-blocks  Used Available Use% Mounted on
   /dev/sda22      10019278     2   9518316   1% /nbsdf
Linux:
   #df /diskf
   Filesystem    1K-blocks       Used      Avail %Cap Mounted on
   /dev/wd0f      10019278          2    9518314   0% /diskf
Al Viro June 11, 2017, 9:30 p.m. UTC | #4
On Sun, Jun 11, 2017 at 12:47:40PM -0700, Richard Narron wrote:

> 1) Creating an OpenBSD 6.1 (44bsd) disk and then on Linux copying a large >
> 2GB file and creating a directory, there are errors on OpenBSD
> with the fsck:
> 
> OpenBSD (44bsd):
>   #fsck sd0e
>   ** /dev/rsd0e
>   ** File system is already clean
>   ** Last Mounted on /diske
>   ** Phase 1 - Check Blocks and Sizes
>   ** Phase 2 - Check Pathnames
>   ** Phase 3 - Check Connectivity
>   ** Phase 4 - Check Reference Counts
>   ** Phase 5 - Check Cyl groups
>   FREE BLK COUNT(S) WRONG IN SUPERBLK
>   SALVAGE? [Fyn?]
>   3 files, 1272410 used, 6983197 free (13 frags, 872898 blocks, 0.0% fragmentation)
> 
>   ***** FILE SYSTEM WAS MODIFIED *****
> 
> And also after removing the files on Linux, there were errors in
> the fsck on OpenBSD:
> 
>   #fsck sd0e
>   ** /dev/rsd0e
>   ** File system is already clean
>   ** Last Mounted on /diske
>   ** Phase 1 - Check Blocks and Sizes
>   ** Phase 2 - Check Pathnames
>   ** Phase 3 - Check Connectivity
>   ** Phase 4 - Check Reference Counts
>   ** Phase 5 - Check Cyl groups
>   FREE BLK COUNT(S) WRONG IN SUPERBLK
>   SALVAGE? [Fyn?]
>   1 files, 1 used, 8255606 free (14 frags, 1031949 blocks, 0.0% fragmentation)
> 
>   ***** FILE SYSTEM WAS MODIFIED *****

Interesting...  The main difference between UFS1 and UFS2 in that area
is the misbegotten rotation latency optimizations (always ugly as hell,
pointless by mid-90s, dropped completely in UFS2 and dropped in FreeBSD
UFS1 in 2002, at the same time when Kirk had merged UFS2).  I hadn't
looked at OpenBSD situation, but their filesystem and VFS side tends
to be, er, somewhat antique.  So that would be my first guess; I'll
resurrect openbsd-in-KVM setup here and see what's really going on...

> 2) The available block counts on a newly created FreeBSD 11.0
>     ufs2 filesystem (13079640) differ from Linux 4.12-rc4 (13079656):
> 
> Freebsd (ufs2):
>   #df /diske
>   Filesystem   1K-blocks Used    Avail Capacity  Mounted on
>   /dev/ada0s3e  14217008    8 13079640     0%    /diske
> Linux:
>   #df /fbsd23
>   Filesystem     1K-blocks  Used Available Use% Mounted on
>   /dev/sda23      14217008     8  13079656   1% /fbsd23

Linux:
        buf->f_bavail = (buf->f_bfree > (((long)buf->f_blocks / 100) * uspi->s_minfree))
                ? (buf->f_bfree - (((long)buf->f_blocks / 100) * uspi->s_minfree)) : 0;
FreeBSD:
	sbp->f_bavail = freespace(fs, fs->fs_minfree) +
            dbtofsb(fs, fs->fs_pendingblocks);
#define freespace(fs, percentreserved) \
        (blkstofrags((fs), (fs)->fs_cstotal.cs_nbfree) + \
        (fs)->fs_cstotal.cs_nffree - \
        (((off_t)((fs)->fs_dsize)) * (percentreserved) / 100))

Note that all those values are in units of fragments, so the size is
not 14217008, it's 3554252.  Now, you clearly have minfree at 8 and
we get (3554252/100)*8 vs. (3554252*8)/100, i.e. 284336 and 284340
respectively.  There's your 16Kb of difference...

> 3) The available block counts on a newly created NetBSD 7.1
>     ufs2 filesystem (9518316) differ from Linux 4.12-rc4 (9518314):
> 
> NetBSD (ufs2):
>   #df /nbsdf
>   Filesystem     1K-blocks  Used Available Use% Mounted on
>   /dev/sda22      10019278     2   9518316   1% /nbsdf
> Linux:
>   #df /diskf
>   Filesystem    1K-blocks       Used      Avail %Cap Mounted on
>   /dev/wd0f      10019278          2    9518314   0% /diskf

Huh?  The other way round?  I'm fairly sure you've mislabeled those -
/dev/wd* makes a plausible NetBSD device name, but /dev/sda22 doesn't.
And "Use%" vs "%Cap" also points in the same direction.  If the first
one is on Linux and the second on NetBSD, I'm fairly certain that it's
the same story.  And if it is, I'm not sure it's worth worrying about -
the difference will never be greater than minfree * fragment size, and
that's pretty much noise.
Al Viro June 12, 2017, 6:14 a.m. UTC | #5
On Sun, Jun 11, 2017 at 12:47:40PM -0700, Richard Narron wrote:
> 1) Creating an OpenBSD 6.1 (44bsd) disk and then on Linux copying a large >
> 2GB file and creating a directory, there are errors on OpenBSD
> with the fsck:
> 
> OpenBSD (44bsd):
>   #fsck sd0e
>   ** /dev/rsd0e
>   ** File system is already clean
>   ** Last Mounted on /diske
>   ** Phase 1 - Check Blocks and Sizes
>   ** Phase 2 - Check Pathnames
>   ** Phase 3 - Check Connectivity
>   ** Phase 4 - Check Reference Counts
>   ** Phase 5 - Check Cyl groups
>   FREE BLK COUNT(S) WRONG IN SUPERBLK
>   SALVAGE? [Fyn?]
>   3 files, 1272410 used, 6983197 free (13 frags, 872898 blocks, 0.0% fragmentation)
> 
>   ***** FILE SYSTEM WAS MODIFIED *****

Can't reproduce...
# fsck -f /dev/rwd1c
** /dev/rwd1c
** File system is already clean
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
3 files, 1573258 used, 489445 free (13 frags, 61179 blocks, 0.0% fragmentation)
# uname -mrsv
OpenBSD 6.1 GENERIC#19 amd64
# file -s /dev/rwd1c 
/dev/rwd1c: Unix Fast File system [v1] (little-endian), last mounted on , last written at Mon Jun 12 06:03:57 2017, clean flag 1, number of blocks 2097152, number of data blocks 2062703, number of cylinder groups 21, block size 16384, fragment size 2048, minimum percentage of free blocks 5, rotational delay 0ms, disk rotational speed 60rps, TIME optimization

That's after
# mkdir /mnt/a; dd if=/dev/zero of=/mnt/a/foo bs=1M count=3072
on the Linux side, with
root@kvm1:~# uname -msrv
Linux 4.12.0-rc1+ #112 SMP Fri Jun 9 17:16:00 EDT 2017 x86_64
there.  -o loop mount on Linux (image living on 9p), direct -hdb ../9p/ufs on OpenBSD side
of things (both in KVM on the same host)...

Reproducer would be nice - ideally on the level of "device is this large, newfs was with
such and such options, series of operations done on the Linux side after mounting that
sucker".
Richard Narron June 13, 2017, 12:54 a.m. UTC | #6
On Mon, 12 Jun 2017, Al Viro wrote:

> On Sun, Jun 11, 2017 at 12:47:40PM -0700, Richard Narron wrote:
>> 1) Creating an OpenBSD 6.1 (44bsd) disk and then on Linux copying a large >
>> 2GB file and creating a directory, there are errors on OpenBSD
>> with the fsck:
>>
>
> Can't reproduce...
> # fsck -f /dev/rwd1c
> ** /dev/rwd1c
> ** File system is already clean
> ** Last Mounted on
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 3 files, 1573258 used, 489445 free (13 frags, 61179 blocks, 0.0% fragmentation)
> # uname -mrsv
> OpenBSD 6.1 GENERIC#19 amd64
> # file -s /dev/rwd1c
> /dev/rwd1c: Unix Fast File system [v1] (little-endian), last mounted on , last written at Mon Jun 12 06:03:57 2017, clean flag 1, number of blocks 2097152, number of data blocks 2062703, number of cylinder groups 21, block size 16384, fragment size 2048, minimum percentage of free blocks 5, rotational delay 0ms, disk rotational speed 60rps, TIME optimization
>
> That's after
> # mkdir /mnt/a; dd if=/dev/zero of=/mnt/a/foo bs=1M count=3072
> on the Linux side, with
> root@kvm1:~# uname -msrv
> Linux 4.12.0-rc1+ #112 SMP Fri Jun 9 17:16:00 EDT 2017 x86_64
> there.  -o loop mount on Linux (image living on 9p), direct -hdb ../9p/ufs on OpenBSD side
> of things (both in KVM on the same host)...

Earlier today I could not reproduce the OpenBSD 6.1 ufs1 fsck error after 
Linux 4.12-rc5 copy of my >2GB file using "cp".

But later today I get the error when I copy using your "dd" method...

In any case I always get a ufs1 fsck error after the Linux rm and rmdir.

Here is the error after rm of the files created by the "dd" method:

#/usr/bin/uname -mrsv
OpenBSD 6.1 GENERIC.MP#6 amd64
#/sbin/fsck -n -f /dev/sd0e
** /dev/rsd0e (NO WRITE)
** File system is already clean
** Last Mounted on /diske
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no

1 files, 1 used, 6682349 free (13 frags, 835292 blocks, 0.0% 
fragmentation)
#/sbin/fsck -f /dev/sd0e
** /dev/rsd0e
** File system is already clean
** Last Mounted on /diske
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [Fyn?]
1 files, 1 used, 8255606 free (14 frags, 1031949 blocks, 0.0% 
fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
#/sbin/mount /dev/sd0e
#/bin/df /diske
Filesystem  512-blocks      Used     Avail Capacity  Mounted on
/dev/sd0e     33022428         4  31371304     0%    /diske
#/sbin/mount /dev/sd0e
#/bin/df /diske
Filesystem  512-blocks      Used     Avail Capacity  Mounted on
/dev/sd0e     33022428         4  31371304     0%    /diske

OpenBSD 6.1 is not very stable for me.

I will test FreeBSD and NetBSD ufs1 to see if they have a problem...
Al Viro June 13, 2017, 1:43 a.m. UTC | #7
On Mon, Jun 12, 2017 at 05:54:06PM -0700, Richard Narron wrote:

> Earlier today I could not reproduce the OpenBSD 6.1 ufs1 fsck error after
> Linux 4.12-rc5 copy of my >2GB file using "cp".
> 
> But later today I get the error when I copy using your "dd" method...
> 
> In any case I always get a ufs1 fsck error after the Linux rm and rmdir.

Interesting...  Could you put together an image (starting with zeroing the
device before newfs, and ideally with dd from /dev/zero to create files)
that would
	a) pass fsck on OpenBSD
	b) after rm on Linux fail the same
then convert it to qcow2 and publish?  Or just compress it - all free and
data blocks would contain only zeroes, so any kind of compression (gzip,
bzip2, whatever) would reduce the size to something more managable...
Richard Narron June 13, 2017, 9:56 p.m. UTC | #8
On Tue, 13 Jun 2017, Al Viro wrote:

> On Mon, Jun 12, 2017 at 05:54:06PM -0700, Richard Narron wrote:
>
>> Earlier today I could not reproduce the OpenBSD 6.1 ufs1 fsck error after
>> Linux 4.12-rc5 copy of my >2GB file using "cp".
>>
>> But later today I get the error when I copy using your "dd" method...
>>
>> In any case I always get a ufs1 fsck error after the Linux rm and rmdir.
>
> Interesting...  Could you put together an image (starting with zeroing the
> device before newfs, and ideally with dd from /dev/zero to create files)
> that would
> 	a) pass fsck on OpenBSD
> 	b) after rm on Linux fail the same
> then convert it to qcow2 and publish?  Or just compress it - all free and
> data blocks would contain only zeroes, so any kind of compression (gzip,
> bzip2, whatever) would reduce the size to something more managable...

I created a gzip and sent you an email with the link to a UFS1 OpenBSD 
filesytem image.

I finished simple testing of UFS1 with FreeBSD and NetBSD and found no 
problems except for the differences between "available" blocks in df 
commands.

And testing UFS2 was fine with all 3 systems, FreeBSD, OpenBSD and 
NetBSD.
diff mbox

Patch

diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index 34f11cf0900a..da553ffec85b 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -848,6 +848,7 @@  void ufs_evict_inode(struct inode * inode)
 		    (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
 		     S_ISLNK(inode->i_mode)))
 			ufs_truncate_blocks(inode);
+		ufs_update_inode(inode, inode_needs_sync(inode));
 	}
 
 	invalidate_inode_buffers(inode);