Message ID | cover.1660690698.git.osandov@fb.com (mailing list archive) |
---|---|
Series | btrfs: fix filesystem corruption caused by space cache race |
On Tue, Aug 16, 2022 at 04:12:14PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Hello,
>
> We recently deployed space_cache v2 on a large set of machines to do
> some performance comparisons and found a nasty filesystem corruption bug
> that was introduced by the fsync transaction optimization in 5.12. It's
> much more likely to affect space_cache=v2 and nospace_cache, but since
> space_cache=v1 effectively falls back to nospace_cache if there is a
> free space inode generation mismatch, space_cache=v1 could also
> theoretically be affected. discard/discard=sync also makes the bug much
> easier to hit by making the race window larger.
>
> Patch 1 is the fix itself with a lot more details. Patch 2 is a followup
> cleanup.
>
> I'm still working on a reproducer, but I wanted to get this fix out
> ASAP.
>
> Thanks!
>
> Omar Sandoval (2):
>   btrfs: fix space cache corruption and potential double allocations
>   btrfs: get rid of block group caching progress logic

Added to misc-next, thanks. A backport for 5.15 would be needed, the
patch does not apply cleanly.
On Tue, Aug 16, 2022 at 04:12:14PM -0700, Omar Sandoval wrote:
> Omar Sandoval (2):
>   btrfs: fix space cache corruption and potential double allocations
>   btrfs: get rid of block group caching progress logic

The patches apply cleanly on misc-next but if you want this fixed in 6.0
I'll need a backported version, and then misc-next will be rebased on
top of that.
Any chance to get this ASAP in the stable kernels?

Cheers,
Chris.
On Tue, Aug 23, 2022 at 09:20:57PM +0200, Christoph Anton Mitterer wrote:
> Any chance to get this ASAP in the stable kernels?
Yes, it'll be in the -rc3 batch and then it gets picked to stable within
a week, depending on the stable release schedule.
From: Omar Sandoval <osandov@fb.com>

Hello,

We recently deployed space_cache v2 on a large set of machines to do
some performance comparisons and found a nasty filesystem corruption bug
that was introduced by the fsync transaction optimization in 5.12. It's
much more likely to affect space_cache=v2 and nospace_cache, but since
space_cache=v1 effectively falls back to nospace_cache if there is a
free space inode generation mismatch, space_cache=v1 could also
theoretically be affected. discard/discard=sync also makes the bug much
easier to hit by making the race window larger.

Patch 1 is the fix itself with a lot more details. Patch 2 is a followup
cleanup.

I'm still working on a reproducer, but I wanted to get this fix out
ASAP.

Thanks!

Omar Sandoval (2):
  btrfs: fix space cache corruption and potential double allocations
  btrfs: get rid of block group caching progress logic

 fs/btrfs/block-group.c     | 49 +++++++++++--------------------------
 fs/btrfs/block-group.h     |  5 +---
 fs/btrfs/extent-tree.c     | 39 +++++++----------------------
 fs/btrfs/free-space-tree.c |  8 -------
 fs/btrfs/transaction.c     | 41 ------------------------------
 fs/btrfs/zoned.c           |  1 -
 6 files changed, 23 insertions(+), 120 deletions(-)
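
For readers who want to see which of their mounts use the options named in the cover letter, here is a minimal sketch. It is not part of the series. The function names are hypothetical, and it assumes the options appear in /proc/mounts-style text spelled the way the cover letter writes them. It only knows that the bug was introduced in 5.12; it cannot tell whether a given kernel already carries the fix or a backport.

```python
import re

# Options the cover letter says make the bug more likely or easier to hit.
# Assumption: they show up in the mount option string with this spelling.
WATCHED_OPTIONS = ("space_cache=v2", "nospace_cache", "discard", "discard=sync")

# The cover letter says the bug was introduced in 5.12.
INTRODUCED_IN = (5, 12)


def parse_kernel_version(release):
    """Return (major, minor) from a release string such as '5.15.0-rc3'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def may_carry_regression(release):
    """True if the release is 5.12 or later. Fixed kernels are NOT detected."""
    return parse_kernel_version(release) >= INTRODUCED_IN


def btrfs_mount_risk(mounts_text):
    """Map each btrfs mount point to the watched options it is mounted with."""
    report = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        options = set(fields[3].split(","))
        report[fields[1]] = [opt for opt in WATCHED_OPTIONS if opt in options]
    return report
```

On a Linux machine this could be fed the contents of /proc/mounts and the string from platform.release(). An empty list for a mount point does not mean the filesystem is safe, since the cover letter notes that space_cache=v1 could theoretically be affected as well.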