Message ID | cover.1709356594.git.ritesh.list@gmail.com (mailing list archive) |
---|---|
Series | ext4: Add direct-io atomic write support using fsawu |
On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
> Hello all,
>
> This RFC series adds support for atomic writes to ext4 direct-io using
> filesystem atomic write unit. It's built on top of John's "block atomic
> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
> atomic write support in underlying device driver and block layer.
>
> This series uses the same RWF_ATOMIC interface for adding atomic write support
> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
> ((1) mkfs.ext4 -b <BS>, (2) with bigalloc).
>
> Filesystem atomic write unit (fsawu):
> ============================================
> Atomic writes within ext4 can be supported using below 3 methods -
> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>    we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>    Now if the underlying HW device supports atomic writes, then a corresponding
>    blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>    should be within the underlying hw defined [awu_min, awu_max] range.
>    For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
>
>    On a smaller pagesize system this can be utilized when support for LBS is
>    complete (on ext4).
>
> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>    allocation in cluster size units. So for e.g. we can create a filesystem with
>    4k blocksize but with 64k clustersize. Such a configuration can also be used
>    to support atomic writes if the underlying hw device supports it.
>    In such case the fsawu_min will most likely be the filesystem blocksize and
>    fsawu_max will most likely be the cluster size.
>
>    So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>    range as long as it satisfies other constraints being laid out by HW device
>    (or by software stack) to support atomic writes.
>    e.g. len should be a power of 2, pos % len should be naturally
>    aligned and [start | end] (phys offsets) should not straddle over
>    an atomic write boundary.

JFYI, I gave this a quick try, and it seems to work ok. Naturally it
suffers from the same issue discussed at
https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/
with regard to writing to partially written extents, which I have tried
to address properly in my v2 for that same series.

Thanks,
John
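
For readers following the constraints quoted above, here is a minimal userspace sketch of the checks that can be done on the file offset and length before issuing an atomic write. It is an illustration only and not code from the series: the helper name atomic_write_args_ok() and its parameters are hypothetical, and the rule that the physical start and end must not straddle an atomic write boundary is not covered, since that depends on the extent layout and can only be checked by the filesystem and block layer.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n is a non-zero power of two. */
static bool is_power_of_2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical pre-check (assumption, not part of the series):
 *  - len must be a power of 2,
 *  - len must lie within [fsawu_min, fsawu_max],
 *  - pos must be naturally aligned to len (pos % len == 0).
 * Physical-offset boundary checks are deliberately left out.
 */
static bool atomic_write_args_ok(uint64_t pos, uint64_t len,
                                 uint64_t fsawu_min, uint64_t fsawu_max)
{
    if (!is_power_of_2(len))
        return false;
    if (len < fsawu_min || len > fsawu_max)
        return false;
    return (pos % len) == 0;
}
```

With a bigalloc setup of 4k blocksize and 64k clustersize, a 16k write at offset 0 would pass this check, while a 16k write at offset 4k would fail the natural alignment rule.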
John Garry <john.g.garry@oracle.com> writes:

> On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
>> Hello all,
>>
>> This RFC series adds support for atomic writes to ext4 direct-io using
>> filesystem atomic write unit. It's built on top of John's "block atomic
>> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
>> atomic write support in underlying device driver and block layer.
>>
>> This series uses the same RWF_ATOMIC interface for adding atomic write support
>> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
>> ((1) mkfs.ext4 -b <BS>, (2) with bigalloc).
>>
>> Filesystem atomic write unit (fsawu):
>> ============================================
>> Atomic writes within ext4 can be supported using below 3 methods -
>> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>>    we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>>    Now if the underlying HW device supports atomic writes, then a corresponding
>>    blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>>    should be within the underlying hw defined [awu_min, awu_max] range.
>>    For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
>>
>>    On a smaller pagesize system this can be utilized when support for LBS is
>>    complete (on ext4).
>>
>> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>>    allocation in cluster size units. So for e.g. we can create a filesystem with
>>    4k blocksize but with 64k clustersize. Such a configuration can also be used
>>    to support atomic writes if the underlying hw device supports it.
>>    In such case the fsawu_min will most likely be the filesystem blocksize and
>>    fsawu_max will most likely be the cluster size.
>>
>>    So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>>    range as long as it satisfies other constraints being laid out by HW device
>>    (or by software stack) to support atomic writes.
>>    e.g. len should be a power of 2, pos % len should be naturally
>>    aligned and [start | end] (phys offsets) should not straddle over
>>    an atomic write boundary.
>
> JFYI, I gave this a quick try, and it seems to work ok. Naturally it

Thanks John for giving this a try!

> suffers from the same issue discussed at
> https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/
> with regard to writing to partially written extents, which I have tried
> to address properly in my v2 for that same series.

I did go through other revisions, but I guess I missed going through
this series. Thanks Dave & John for your comments over the series.
Let me go through the revisions I have missed and John's latest
revision. I will update this series accordingly.

Appreciate your help!

-ritesh
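
As a rough illustration of the two configurations described in the cover letter, the sketch below derives an fsawu range from the filesystem geometry and the hardware [awu_min, awu_max] range. Everything here is an assumption made for illustration: struct fsawu_range and both helper names are hypothetical, and intersecting the bigalloc range [blocksize, clustersize] with the hardware range is one plausible reading of "most likely", not the series' actual logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: supported == false means no usable unit. */
struct fsawu_range {
    uint32_t min;
    uint32_t max;
    bool supported;
};

/* Method 1 (mkfs.ext4 -b <BS>): fsawu_min == fsawu_max == blocksize,
 * provided the blocksize lies within the hardware range. */
static struct fsawu_range fsawu_from_blocksize(uint32_t blocksize,
                                               uint32_t awu_min,
                                               uint32_t awu_max)
{
    struct fsawu_range r = { 0, 0, false };

    if (blocksize >= awu_min && blocksize <= awu_max) {
        r.min = blocksize;
        r.max = blocksize;
        r.supported = true;
    }
    return r;
}

/* Method 2 (bigalloc): assume the range is [blocksize, clustersize]
 * intersected with the hardware range (an assumption of this sketch). */
static struct fsawu_range fsawu_from_bigalloc(uint32_t blocksize,
                                              uint32_t clustersize,
                                              uint32_t awu_min,
                                              uint32_t awu_max)
{
    struct fsawu_range r = { 0, 0, false };
    uint32_t lo = blocksize > awu_min ? blocksize : awu_min;
    uint32_t hi = clustersize < awu_max ? clustersize : awu_max;

    if (lo <= hi) {
        r.min = lo;
        r.max = hi;
        r.supported = true;
    }
    return r;
}
```

For the example in the cover letter (4k blocksize, 64k clustersize) on a device whose range is [4k, 64k], this sketch yields fsawu_min = 4k and fsawu_max = 64k.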