Message ID | cover.1742834133.git.dsterba@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [GIT,PULL] Btrfs updates for 6.15 | expand |
The pull request you sent on Mon, 24 Mar 2025 17:37:51 +0100:
> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git tags/for-6.15-tag
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/fd71def6d9abc5ae362fb9995d46049b7b0ed391
Thank you!
On Mon, Mar 24, 2025 at 05:37:51PM +0100, David Sterba wrote: > Hi, > > please pull the following btrfs updates, thanks. > > User visible changes: > > - fall back to buffered write if direct io is done on a file that requires > checksums <trimming the everybody linux-btrfs from the cc list> What? We use this constantly in a bunch of places to avoid page cache overhead, it's perfectly legitimate to do DIO to a file that requires checksums. Does the vm case mess this up? Absolutely, but that's why we say use NOCOW for that case. We've always had this behavior, we've always been clear that if you break it you buy it. This is a huge regression for a pretty significant use case. Thanks, Josef
On Fri, Mar 28, 2025 at 09:27:51AM -0400, Josef Bacik wrote: > On Mon, Mar 24, 2025 at 05:37:51PM +0100, David Sterba wrote: > > Hi, > > > > please pull the following btrfs updates, thanks. > > > > User visible changes: > > > > - fall back to buffered write if direct io is done on a file that requires > > checksums > > <trimming the everybody linux-btrfs from the cc list> > > What? We use this constantly in a bunch of places to avoid page cache overhead, > it's perfectly legitimate to do DIO to a file that requires checksums. Does the > vm case mess this up? Absolutely, but that's why we say use NOCOW for that > case. We've always had this behavior, we've always been clear that if you break > it you buy it. This is a huge regression for a pretty significant use case. The patch has been up for like 2 months and you could have said "don't because reasons" any time before the pull request. Now we're left with a revert, or other alternatives making the use cases working.
On Fri, Mar 28, 2025 at 06:36:44PM +0100, David Sterba wrote: > On Fri, Mar 28, 2025 at 09:27:51AM -0400, Josef Bacik wrote: > > On Mon, Mar 24, 2025 at 05:37:51PM +0100, David Sterba wrote: > > > Hi, > > > > > > please pull the following btrfs updates, thanks. > > > > > > User visible changes: > > > > > > - fall back to buffered write if direct io is done on a file that requires > > > checksums > > > > <trimming the everybody linux-btrfs from the cc list> > > > > What? We use this constantly in a bunch of places to avoid page cache overhead, > > it's perfectly legitimate to do DIO to a file that requires checksums. Does the > > vm case mess this up? Absolutely, but that's why we say use NOCOW for that > > case. We've always had this behavior, we've always been clear that if you break > > it you buy it. This is a huge regression for a pretty significant use case. > > The patch has been up for like 2 months and you could have said "don't > because reasons" any time before the pull request. Now we're left with a > revert, or other alternatives making the use cases working. Boris told me about this and I forgot. I took everybody off the CC list because I don't want to revert it, in fact generally speaking I'd love to never have these style of bug reports again. But it is a pretty significant change. Are we ok going forward saying you don't get O_DIRECT unless you want NOCOW? Should we maybe allow for users to indicate they're not dumb and can be trusted to do O_DIRECT properly? I just think this opens us up to a lot more uncomfortable conversations than the other behavior. I personally think this is better, if it's been sitting there for 2 months then hooray we're in agreement. But I'm also worried it'll come back to bite us. Thanks, Josef
在 2025/3/29 06:09, Josef Bacik 写道: > On Fri, Mar 28, 2025 at 06:36:44PM +0100, David Sterba wrote: >> On Fri, Mar 28, 2025 at 09:27:51AM -0400, Josef Bacik wrote: >>> On Mon, Mar 24, 2025 at 05:37:51PM +0100, David Sterba wrote: >>>> Hi, >>>> >>>> please pull the following btrfs updates, thanks. >>>> >>>> User visible changes: >>>> >>>> - fall back to buffered write if direct io is done on a file that requires >>>> checksums >>> >>> <trimming the everybody linux-btrfs from the cc list> >>> >>> What? We use this constantly in a bunch of places to avoid page cache overhead, >>> it's perfectly legitimate to do DIO to a file that requires checksums. Does the >>> vm case mess this up? Absolutely, but that's why we say use NOCOW for that >>> case. We've always had this behavior, we've always been clear that if you break >>> it you buy it. This is a huge regression for a pretty significant use case. BTW, it's not clear, it's not in the documents (until recently, along with this change), and no one is really explaining that to end users until now. If you're a experienced fs/mm developer, sure it's obvious, but most users (millions of VM users who choose to use none cache mode) are not so experienced. Future more, a lot of such VM users are just trying to reduce page cache memory usage, not so about the zero-copy performance. And the buffered fallback should still make a lot of sense for them, and still allow working data checksum (which they may care more than the performance). >> >> The patch has been up for like 2 months and you could have said "don't >> because reasons" any time before the pull request. Now we're left with a >> revert, or other alternatives making the use cases working. > > Boris told me about this and I forgot. I took everybody off the CC list because > I don't want to revert it, in fact generally speaking I'd love to never have > these style of bug reports again. > > But it is a pretty significant change. Are we ok going forward saying you don't > get O_DIRECT unless you want NOCOW? Despite NOCOW, there are still other things (like alignment) can make O_DIRECT to fall back buffered, without even letting the users know. And it is always fs specific on whether it choose to fall buffered or return -EINVAL. So in the first place there is never any guarantee that O_DIRECT always results in zero-copy writes. > Should we maybe allow for users to indicate > they're not dumb and can be trusted to do O_DIRECT properly? I just think this > opens us up to a lot more uncomfortable conversations than the other behavior. For whatever reason, no matter dumb users or not, all operations through a fs should not lead to data/metadata corruption, no matter if it's real corruption or just false csum mismatch. (Not to mention it's not easy to fix the csum mismatch either) If dumb programs can still corrupt the fs, then it's a bug in the fs. To me, this is very straightforward, and the priority should be obvious. > > I personally think this is better, if it's been sitting there for 2 months then > hooray we're in agreement. But I'm also worried it'll come back to bite us. I'd say, let it bite. If it's some performance critical use cases, I believe the developer should be reasonable enough to be persuaded (either choose perf and go NODATASUM, or choose datacsum and accept the perf drop). If it's newbie level usages, they shouldn't even notice any difference, except the fact btrfs no longer corrupts their files. Even if it really bites in the future, I doubt it will be any worse. On the other hand, the data csum mismatch for VM images are already biting us for too long, and it's really damaging the reputation of Btrfs. Thanks, Qu > Thanks, > > Josef