Message ID | cover.1733384171.git.naohiro.aota@wdc.com (mailing list archive) |
---|---|
Headers | show |
Series | btrfs: zoned: split out data relocation space_info | expand |
On 05.12.24 08:50, Naohiro Aota wrote: > As discussed in [1], there is a longstanding early ENOSPC issue on the > zoned mode even with simple fio script. This is also causing blktests > zbd/009 to fail [2]. > > [1] https://lore.kernel.org/linux-btrfs/cover.1731571240.git.naohiro.aota@wdc.com/ > [2] https://github.com/osandov/blktests/issues/150 > > This series is the second part to fix the ENOSPC issue. This series > introduces "space_info sub-space" and use it split a space_info for data > relocation block group. > > Current code assumes we have only one space_info for each block group type > (DATA, METADATA, and SYSTEM). We sometime needs multiple space_info to > manage special block groups. > > One example is handling the data relocation block group for the zoned mode. > That block group is dedicated for writing relocated data and we cannot > allocate any regular extent from that block group, which is implemented in > the zoned extent allocator. That block group still belongs to the normal > data space_info. So, when all the normal data block groups are full and > there are some free space in the dedicated block group, the space_info > looks to have some free space, while it cannot allocate normal extent > anymore. That results in a strange ENOSPC error. We need to have a > space_info for the relocation data block group to represent the situation > properly. I like the idea and the patches, but I'm a bit concerned it diverges zoned and non-zoned btrfs quite a bit in handling relocation. I'd be interested what David and Josef think of it. If no one objects to have these sub-space_infos zoned specific I'm all good with it as it fixes real problems. Would it be useful to also do the same for regular btrfs? And while we're at it, the treelog block-group for zoned mode could benefit form a own space-info as well, couldn't it? To not run into premature ENOSPC on frequent syncs, or is this unlikely to happen (I'm thinking out loud here). Byte, Johannes
On Sat, Dec 07, 2024 at 11:35:04AM +0000, Johannes Thumshirn wrote: > On 05.12.24 08:50, Naohiro Aota wrote: > > As discussed in [1], there is a longstanding early ENOSPC issue on the > > zoned mode even with simple fio script. This is also causing blktests > > zbd/009 to fail [2]. > > > > [1] https://lore.kernel.org/linux-btrfs/cover.1731571240.git.naohiro.aota@wdc.com/ > > [2] https://github.com/osandov/blktests/issues/150 > > > > This series is the second part to fix the ENOSPC issue. This series > > introduces "space_info sub-space" and use it split a space_info for data > > relocation block group. > > > > Current code assumes we have only one space_info for each block group type > > (DATA, METADATA, and SYSTEM). We sometime needs multiple space_info to > > manage special block groups. > > > > One example is handling the data relocation block group for the zoned mode. > > That block group is dedicated for writing relocated data and we cannot > > allocate any regular extent from that block group, which is implemented in > > the zoned extent allocator. That block group still belongs to the normal > > data space_info. So, when all the normal data block groups are full and > > there are some free space in the dedicated block group, the space_info > > looks to have some free space, while it cannot allocate normal extent > > anymore. That results in a strange ENOSPC error. We need to have a > > space_info for the relocation data block group to represent the situation > > properly. > > I like the idea and the patches, but I'm a bit concerned it diverges > zoned and non-zoned btrfs quite a bit in handling relocation. I'd be > interested what David and Josef think of it. If no one objects to have > these sub-space_infos zoned specific I'm all good with it as it fixes > real problems. Hmm, for that point, we already do the relocation a bit different on zoned vs non-zoned. On the zoned mode, the relocated data is always allocated from a dedicated block group. This series just move that block group into its own space_info (aka sub-space_info) to fix a space accounting issue. I admit the concept of sub-space_info is new, so I'd like to hear David and Josef's opinion on it. > > Would it be useful to also do the same for regular btrfs? And while > we're at it, the treelog block-group for zoned mode could benefit form a > own space-info as well, couldn't it? To not run into premature ENOSPC on > frequent syncs, or is this unlikely to happen (I'm thinking out loud here). On the regular mode, it can allocate space for relocation data from any block group. So, it is not so useful to separate space_info for that. However, it would be interesting to have a dedicated relocation data block group as well on the regular mode, because it could reduce the fragmentation of relocated data. This would be interesting topic to explore. Yes, adding treelog sub-space_info is useful too. Apparently, I sometime see a test failure due to treelog space_info accounting mismatch. I didn't implement that sub-space_info for now, because I'd like to know first that the sub-space_info concept itself is the right way to go. > > Byte, > Johannes