Message ID | 20241206221801.790690-1-kbusch@meta.com (mailing list archive) |
---|---|
Headers | show |
Series | block write streams with nvme fdp | expand |
I just compared this to a crude rebase of what I last sent out, and AFAICS the differences are: 1) basically all new io_uring handling due to the integrity stuff that went in 2) fixes for the NVMe FDP log page parsing 3) drop the support for the remapping of per-partition streams conceptually this all looks fine to me. I'll throw in a few nitpicks on the nvme bits, and I'd need to get up to speed a bit more on the io_uring bits before commenting useful. One thing that came I was pondering for a new version is if statx really is the right vehicle for this as it is a very common fast-path information. If we had a separate streaminfo ioctl or fcntl it might be easier to leave a bit spare space for extensibility. I can try to prototype that or we can leave it as-is because everyone is tired of the series.
On Mon, Dec 09, 2024 at 01:55:11PM +0100, Christoph Hellwig wrote: > I just compared this to a crude rebase of what I last sent out, and > AFAICS the differences are: > > 1) basically all new io_uring handling due to the integrity stuff that > went in > 2) fixes for the NVMe FDP log page parsing > 3) drop the support for the remapping of per-partition streams Yep, pretty much. I will revisit the partition mapping. I just haven't heard any use cases for divvying the streams up this way, so it's not clear to me what the interface needs to provide. > One thing that came I was pondering for a new version is if statx > really is the right vehicle for this as it is a very common fast-path > information. If we had a separate streaminfo ioctl or fcntl it might > be easier to leave a bit spare space for extensibility. I can try to > prototype that or we can leave it as-is because everyone is tired of > the series. Oh sure. I can live without the statx parts from this series if you prefer we take additional time to consider other approaches. We have the sysfs block attributes reporting the same information, and that is okay for now.
Hi Keith! >> 3) drop the support for the remapping of per-partition streams > > Yep, pretty much. I will revisit the partition mapping. I just haven't > heard any use cases for divvying the streams up this way, so it's not > clear to me what the interface needs to provide. Since the streams are a (very) scarce hardware resource, it does seem to me like we should have an explicit interface for an entity (whether app-on-bdev or a filesystem) to allocate them. While there certainly are cases where there is a 1:1 app-to-device mapping, as soon as you add virtualization or enterprise apps to the mix, that assumption quickly falls apart...
On Mon, Dec 09, 2024 at 09:07:35AM -0700, Keith Busch wrote: > Yep, pretty much. I will revisit the partition mapping. I just haven't > heard any use cases for divvying the streams up this way, so it's not > clear to me what the interface needs to provide. Yes, it would be good to understand use cases first. I just threw the patch in as a POC to show we can do it. > > One thing that came I was pondering for a new version is if statx > > really is the right vehicle for this as it is a very common fast-path > > information. If we had a separate streaminfo ioctl or fcntl it might > > be easier to leave a bit spare space for extensibility. I can try to > > prototype that or we can leave it as-is because everyone is tired of > > the series. > > Oh sure. I can live without the statx parts from this series if you > prefer we take additional time to consider other approaches. We have the > sysfs block attributes reporting the same information, and that is okay > for now. I'll try to find some time this afternoon for an interface, but if it doesn't arrive in time we can probably drop if for the next submission.
From: Keith Busch <kbusch@kernel.org> changes from v11: - Place the write hint in an unused io_uring SQE field - Obviates the need to modify the external "attributes" stuff that support PI - Make it a u8 to match the type the block layer supports - And it's just easier to use for the user - Fix the sparse warnings from FDP definitions - Just use the patches that Christoph posted a few weeks ago since it already defined it in a way that makes sparse happy; I just made some minor changes to field names to match what the spec calls them - Actually include the first patch in this series Christoph Hellwig (7): fs: add a write stream field to the kiocb block: add a bi_write_stream field block: introduce a write_stream_granularity queue limit block: expose write streams for block device nodes nvme: add a nvme_get_log_lsi helper nvme: pass a void pointer to nvme_get/set_features for the result nvme.h: add FDP definitions Keith Busch (5): fs: add write stream information to statx block: introduce max_write_streams queue limit io_uring: enable per-io write streams nvme: register fdp parameters with the block layer nvme: use fdp streams if write stream is provided Documentation/ABI/stable/sysfs-block | 15 +++ block/bdev.c | 6 + block/bio.c | 2 + block/blk-crypto-fallback.c | 1 + block/blk-merge.c | 4 + block/blk-sysfs.c | 6 + block/bounce.c | 1 + block/fops.c | 23 ++++ drivers/nvme/host/core.c | 160 ++++++++++++++++++++++++++- drivers/nvme/host/nvme.h | 9 +- fs/stat.c | 2 + include/linux/blk_types.h | 1 + include/linux/blkdev.h | 16 +++ include/linux/fs.h | 1 + include/linux/nvme.h | 77 +++++++++++++ include/linux/stat.h | 2 + include/uapi/linux/io_uring.h | 4 + include/uapi/linux/stat.h | 7 +- io_uring/io_uring.c | 2 + io_uring/rw.c | 1 + 20 files changed, 332 insertions(+), 8 deletions(-)