Message ID | 20200117140224.42495-1-josef@toxicpanda.com (mailing list archive) |
---|---|
Headers | show |
Series | btrfs: fix hole corruption issue with !NO_HOLES | expand |
On Fri, Jan 17, 2020 at 09:02:18AM -0500, Josef Bacik wrote: > v2->v3: > - Rebased onto misc-next. > - Added a patch to stop using ->leave_spinning in truncate_inode_items. > > v1->v2: > - fixed a bug in 'btrfs: use the file extent tree infrastructure' that would > result in 0 length files because btrfs_truncate_inode_items() was clearing the > file extent map when we fsync'ed multiple times. Validated this with a > modified fsx and generic/521 that reproduced the problem, those modifications > were sent up as well. > - dropped the RFC > > ----------------- Original Message ----------------------- > We've historically had this problem where you could flush a targeted section of > an inode and end up with a hole between extents without a hole extent item. > This of course makes fsck complain because this is not ok for a file system that > doesn't have NO_HOLES set. Because this is a well understood problem I and > others have been ignoring fsck failures during certain xfstests (generic/475 for > example) because they would regularly trigger this edge case. > > However this isn't a great behavior to have, we should really be taking all fsck > failures seriously, and we could potentially ignore fsck legitimate fsck errors > because we expect it to be this particular failure. > > In order to fix this we need to keep track of where we have valid extent items, > and only update i_size to encompass that area. This unfortunately means we need > a new per-inode extent_io_tree to keep track of the valid ranges. This is > relatively straightforward in practice, and helpers have been added to manage > this so that in the case of a NO_HOLES file system we just simply skip this work > altogether. > > I've been hammering on this for a week now and I'm pretty sure its ok, but I'd > really like Filipe to take a look and I still have some longer running tests > going on the series. All of our boxes internally are btrfs and the box I was > testing on ended up with a weird RPM db corruption that was likely from an > earlier, broken version of the patch. However I cannot be 100% sure that was > the case, so I'm giving it a few more days of testing before I'm satisfied > there's not some weird thing that RPM does that xfstests doesn't cover. > > This has gone through several iterations of xfstests already, including many > loops of generic/475 for validation to make sure it was no longer failing. So > far so good, but for something like this wider testing will definitely be > necessary. Thanks, I've reviewed the series and will add it to for-next. The i_size tracking seems to be an important part that we want to merge before NO_HOLES is default in mkfs. It would be good to steer more focus on that during testing. If everything goes fine, the mkfs default can happen in 5.7. Thanks.
On Mon, Feb 3, 2020 at 7:11 PM David Sterba <dsterba@suse.cz> wrote: > > On Fri, Jan 17, 2020 at 09:02:18AM -0500, Josef Bacik wrote: > > v2->v3: > > - Rebased onto misc-next. > > - Added a patch to stop using ->leave_spinning in truncate_inode_items. > > > > v1->v2: > > - fixed a bug in 'btrfs: use the file extent tree infrastructure' that would > > result in 0 length files because btrfs_truncate_inode_items() was clearing the > > file extent map when we fsync'ed multiple times. Validated this with a > > modified fsx and generic/521 that reproduced the problem, those modifications > > were sent up as well. > > - dropped the RFC > > > > ----------------- Original Message ----------------------- > > We've historically had this problem where you could flush a targeted section of > > an inode and end up with a hole between extents without a hole extent item. > > This of course makes fsck complain because this is not ok for a file system that > > doesn't have NO_HOLES set. Because this is a well understood problem I and > > others have been ignoring fsck failures during certain xfstests (generic/475 for > > example) because they would regularly trigger this edge case. > > > > However this isn't a great behavior to have, we should really be taking all fsck > > failures seriously, and we could potentially ignore fsck legitimate fsck errors > > because we expect it to be this particular failure. > > > > In order to fix this we need to keep track of where we have valid extent items, > > and only update i_size to encompass that area. This unfortunately means we need > > a new per-inode extent_io_tree to keep track of the valid ranges. This is > > relatively straightforward in practice, and helpers have been added to manage > > this so that in the case of a NO_HOLES file system we just simply skip this work > > altogether. > > > > I've been hammering on this for a week now and I'm pretty sure its ok, but I'd > > really like Filipe to take a look and I still have some longer running tests > > going on the series. All of our boxes internally are btrfs and the box I was > > testing on ended up with a weird RPM db corruption that was likely from an > > earlier, broken version of the patch. However I cannot be 100% sure that was > > the case, so I'm giving it a few more days of testing before I'm satisfied > > there's not some weird thing that RPM does that xfstests doesn't cover. > > > > This has gone through several iterations of xfstests already, including many > > loops of generic/475 for validation to make sure it was no longer failing. So > > far so good, but for something like this wider testing will definitely be > > necessary. Thanks, > > I've reviewed the series and will add it to for-next. The i_size > tracking seems to be an important part that we want to merge before > NO_HOLES is default in mkfs. Can you elaborate a bit? I don't understand why you say this is important in order to make NO_HOLES a default. This series fixes a problem that only happens when not using the NO_HOLES feature. Thanks. > It would be good to steer more focus on > that during testing. If everything goes fine, the mkfs default can > happen in 5.7. Thanks.