Message ID | cover.1709356594.git.ritesh.list@gmail.com (mailing list archive) |
---|---|
Series | ext4: Add direct-io atomic write support using fsawu |
On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
> Hello all,
>
> This RFC series adds support for atomic writes to ext4 direct-io using
> filesystem atomic write unit. It's built on top of John's "block atomic
> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
> atomic write support in underlying device driver and block layer.
>
> This series uses the same RWF_ATOMIC interface for adding atomic write support
> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
> ((1) mkfs.ext4 -b <BS>, (2) with bigalloc).
>
> Filesystem atomic write unit (fsawu):
> ============================================
> Atomic writes within ext4 can be supported using below 3 methods -
> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>    we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>    Now if the underlying HW device supports atomic writes, then a corresponding
>    blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>    should be within the underlying hw defined [awu_min, awu_max] range.
>    For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
>
>    On a smaller pagesize system this can be utilized when support for LBS is
>    complete (on ext4).
>
> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>    allocation in cluster size units. So for e.g. we can create a filesystem with
>    4k blocksize but with 64k clustersize. Such a configuration can also be used
>    to support atomic writes if the underlying hw device supports it.
>    In such case the fsawu_min will most likely be the filesystem blocksize and
>    fsawu_max will most likely be the cluster size.
>
>    So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>    range as long as it satisfies other constraints being laid out by HW device
>    (or by software stack) to support atomic writes.
>    e.g. len should be a power of 2, pos % len should be naturally
>    aligned and [start | end] (phys offsets) should not straddle over
>    an atomic write boundary.

JFYI, I gave this a quick try, and it seems to work ok. Naturally it
suffers from the same issue discussed at
https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/
with regards to writing to partially written extents, which I have tried
to address properly in my v2 for that same series.

Thanks,
John

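As a rough illustration of the two configurations in the quoted cover letter, the sketch below derives an [fsawu_min, fsawu_max] range from the filesystem geometry and the hardware limits. The function name fsawu_range and its parameters are hypothetical and not taken from the patch series. Clamping the bigalloc range to the hardware [awu_min, awu_max] is an assumption; the cover letter only says the values will "most likely" be the blocksize and the cluster size.

```python
K = 1024


def fsawu_range(blocksize, hw_awu_min, hw_awu_max, cluster_size=None):
    """Return (fsawu_min, fsawu_max) in bytes, or None if nothing fits.

    Hypothetical helper, not kernel code.
    - Without bigalloc (cluster_size is None): fsawu_min == fsawu_max ==
      blocksize, and the blocksize must lie within the hardware range.
    - With bigalloc: the candidate range is [blocksize, cluster_size].
      Intersecting it with the hardware range is an assumption.
    """
    fs_min = blocksize
    fs_max = cluster_size if cluster_size is not None else blocksize

    lo = max(fs_min, hw_awu_min)
    hi = min(fs_max, hw_awu_max)
    if lo > hi:
        return None  # no size satisfies both filesystem and hardware limits
    return lo, hi


if __name__ == "__main__":
    # Method 1: 16k blocksize on a device advertising [4k, 64k]
    print(fsawu_range(16 * K, 4 * K, 64 * K))
    # Method 2: 4k blocksize with a 64k bigalloc cluster on the same device
    print(fsawu_range(4 * K, 4 * K, 64 * K, cluster_size=64 * K))
```

With a blocksize larger than the hardware awu_max (for example 64k on a device limited to 16k) the helper returns None, which corresponds to the case where that blocksize cannot serve as an fsawu.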
John Garry <john.g.garry@oracle.com> writes:

> On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
>> Hello all,
>>
>> This RFC series adds support for atomic writes to ext4 direct-io using
>> filesystem atomic write unit. It's built on top of John's "block atomic
>> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
>> atomic write support in underlying device driver and block layer.
>>
>> This series uses the same RWF_ATOMIC interface for adding atomic write support
>> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
>> ((1) mkfs.ext4 -b <BS>, (2) with bigalloc).
>>
>> Filesystem atomic write unit (fsawu):
>> ============================================
>> Atomic writes within ext4 can be supported using below 3 methods -
>> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>>    we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>>    Now if the underlying HW device supports atomic writes, then a corresponding
>>    blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>>    should be within the underlying hw defined [awu_min, awu_max] range.
>>    For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
>>
>>    On a smaller pagesize system this can be utilized when support for LBS is
>>    complete (on ext4).
>>
>> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>>    allocation in cluster size units. So for e.g. we can create a filesystem with
>>    4k blocksize but with 64k clustersize. Such a configuration can also be used
>>    to support atomic writes if the underlying hw device supports it.
>>    In such case the fsawu_min will most likely be the filesystem blocksize and
>>    fsawu_max will most likely be the cluster size.
>>
>>    So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>>    range as long as it satisfies other constraints being laid out by HW device
>>    (or by software stack) to support atomic writes.
>>    e.g. len should be a power of 2, pos % len should be naturally
>>    aligned and [start | end] (phys offsets) should not straddle over
>>    an atomic write boundary.
>
> JFYI, I gave this a quick try, and it seems to work ok. Naturally it

Thanks John for giving this a try!

> suffers from the same issue discussed at
> https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/
> with regards to writing to partially written extents, which I have tried
> to address properly in my v2 for that same series.

I did go through other revisions, but I guess I missed going through this series.

Thanks Dave & John for your comments over the series. Let me go through the
revisions I have missed and John's latest revision. I will update this series
accordingly.

Appreciate your help!

-ritesh

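The constraints listed at the end of the quoted cover letter (power-of-2 length, natural alignment, size within [fsawu_min, fsawu_max], and no straddling of an atomic write boundary) can be sketched as a simple validity check. This is only an illustration under stated assumptions: the names can_write_atomically, phys_pos and boundary are hypothetical, and the real checks live in the kernel's block layer and filesystem code, not in userspace.

```python
def is_power_of_two(n):
    """True if n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def can_write_atomically(pos, length, fsawu_min, fsawu_max,
                         phys_pos=None, boundary=None):
    """Check a write of `length` bytes at file offset `pos`.

    Hypothetical helper, not kernel code. `phys_pos` (physical offset of
    the write) and `boundary` (atomic write boundary in bytes) are
    optional; the boundary check is skipped unless both are given.
    """
    if not is_power_of_two(length):
        return False  # len must be a power of 2
    if not (fsawu_min <= length <= fsawu_max):
        return False  # size must lie within [fsawu_min, fsawu_max]
    if pos % length != 0:
        return False  # pos must be naturally aligned to len
    if phys_pos is not None and boundary:
        start, end = phys_pos, phys_pos + length - 1
        if start // boundary != end // boundary:
            return False  # [start, end] straddles an atomic write boundary
    return True


if __name__ == "__main__":
    K = 1024
    print(can_write_atomically(16 * K, 16 * K, 4 * K, 64 * K))  # aligned 16k write
    print(can_write_atomically(4 * K, 16 * K, 4 * K, 64 * K))   # misaligned offset
```

A caller would run such a check before issuing pwritev2() with RWF_ATOMIC; a write that fails any of these conditions is expected to be rejected rather than silently split.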
On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:

Hi Ritesh,

> Hello all,
>
> This RFC series adds support for atomic writes to ext4 direct-io using
> filesystem atomic write unit. It's built on top of John's "block atomic
> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
> atomic write support in underlying device driver and block layer.

I am curious - do you have any plans to progress this work?

John

John Garry <john.g.garry@oracle.com> writes:

> On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
>
> Hi Ritesh,
>
>> Hello all,
>>
>> This RFC series adds support for atomic writes to ext4 direct-io using
>> filesystem atomic write unit. It's built on top of John's "block atomic
>> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
>> atomic write support in underlying device driver and block layer.
>
> I am curious - do you have any plans to progress this work?
>

Yes John. I have resumed my work on the interface changes for direct-io
atomic write for ext4 (hence all the queries on the other email).
We do intend to get this going.

Meanwhile Ojaswin has been working on the extsize feature for ext4 (similar to
XFS). It uses some of our previous mballoc order-0 allocation work to support
aligned allocations. The patch series is almost in its final stages.
He will soon be posting an initial RFC design of the same (hopefully by next
week).

Thanks again for your help!

-ritesh